In this work, we resolve an open problem posed by Joswig et al. [49] by providing an $\tilde{O}(N)$ time, $O(\log^2(N))$ factor approximation algorithm for the min-Morse unmatched problem (MMUP). Let $\Lambda$ be the number of critical cells of the optimal discrete Morse function and $N$ be the total number of cells of a regular cell complex $\mathcal{K}$. The goal of MMUP is to find $\Lambda$ for a given $\mathcal{K}$. To begin with, we apply an approximation preserving graph reduction procedure on MMUP to obtain a new problem, namely the min-partial order problem (min-POP), a strict generalization of the min-feedback arc set problem (min-FAS). The reduction involves the introduction of rigid edges, which are edges that demand strict inclusion in the output solution. To solve min-POP, we use the Leighton-Rao divide-&-conquer paradigm that provides solutions to SDP-formulated instances of min-directed balanced cut with rigid edges (min-DBCRE). Our first algorithm for min-DBCRE extends Agarwal et al.'s rounding procedure for the digraph formulation of the ARV-algorithm to handle rigid edges. Our second algorithm to solve the min-DBCRE SDP adapts Arora et al.'s primal dual MWUM. In terms of applications, under the mild assumption$^{1}$ of the size of topological features being significantly smaller compared to the size of the complex, we obtain (a) an $\tilde{O}(N)$ algorithm for computing homology groups $H_i(\mathcal{K}, A)$ of a simplicial complex $\mathcal{K}$ (where $A$ is an arbitrary abelian group), (b) an $\tilde{O}(N^2)$ algorithm for computing persistent homology and (c) an $\tilde{O}(N)$ algorithm for computing the optimal discrete Morse-Witten function compatible with an input scalar function, as simple consequences of our approximation algorithm for MMUP, thereby giving us the best known complexity bounds for each of these applications under the aforementioned assumption. Such an assumption is realistic in applied settings, and often a characteristic of modern massive datasets. Also, for the scalar field topology application, we discuss why the prescribed conditions for compatibility are natural, rigorous and general.


1. Introduction 


Classical Morse theory [65, 67] analyzes the topology of Riemannian manifolds by studying critical points of smooth functions defined on them. It is an elegant extension of the maximum and minimum principles for smooth functions on compact manifolds, and it can be considered a part of differential topology. Marston Morse established this subject while using critical point theory to study closed geodesics on smooth Riemannian manifolds. Its place in the field of geometry and topology got firmly established through subsequent applications and extensions by Smale, Bott, Thom, Novikov, Witten, Goresky and MacPherson, Arnold and Floer among others. For instance, it was applied by Smale to resolve the Poincaré conjecture in the higher dimensions and also to mathematically formulate the Pareto optimality problem in economics [65, 67, 86].

1 The statement of the assumption is made mathematically precise in the main body of the paper.


In the 90s, Robin Forman formulated a completely combinatorial analogue of Morse theory, now known as discrete Morse theory. Since Lewiner's doctoral work [57], discrete Morse theory has rapidly become a popular tool in the computational topology and visualization communities [40, 44, 80].

Forman provides an extremely readable and compelling introduction to discrete Morse theory [34]. Please refer to Figure 1 and Table 1 in subsubsection 2.2.3 for a quick overview of the graph theory setting of discrete Morse theory, so as to facilitate a quick foray into the core computer science problem at hand.


2. Background and Preliminaries 

2.1. Approximation Algorithms 

[87] and [91] are standard texts for approximation algorithms.

Definition 1 (Approximation Algorithm). Given a problem $\Pi$, an $\alpha$-approximation algorithm $\mathcal{A}_{\Pi}$ is a polynomial time algorithm that produces a solution whose value $\mathcal{V}_{\Pi}$ is within a factor $\alpha$ of the value of the optimal solution $\mathcal{O}_{\Pi}$, for all instances $\mathcal{I}_{\Pi}$ of the problem.

For a minimization problem, $\mathcal{V}_{\Pi} / \mathcal{O}_{\Pi} \le \alpha$ for some $\alpha > 1$.

Definition 2 (Approximation factor preserving reduction). Given two minimization problems $\Pi_1$ and $\Pi_2$ (with optimal solutions of values $\mathcal{O}_{\Pi_1}$ and $\mathcal{O}_{\Pi_2}$ respectively), an approximation factor preserving reduction from $\Pi_1$ to $\Pi_2$ consists of two polynomial time algorithms $f(\cdot)$ and $g(\cdot)$ s.t.,

1. For any instance $\mathcal{I}_1$ of $\Pi_1$, $\mathcal{I}_2 = f(\mathcal{I}_1)$ is an instance of $\Pi_2$ s.t.

$$\mathcal{O}_{\Pi_2}(\mathcal{I}_2) \le \mathcal{O}_{\Pi_1}(\mathcal{I}_1) \qquad (1)$$

2. Also, for any solution $t$ of $\mathcal{I}_2$, $s = g(\mathcal{I}_1, t)$ is a solution of $\mathcal{I}_1$ s.t.

$$\mathcal{V}_{\Pi_1}(\mathcal{I}_1, s) \le \mathcal{V}_{\Pi_2}(\mathcal{I}_2, t) \qquad (2)$$

It can be easily shown that such a reduction combined with an $\alpha$-approximation algorithm for $\Pi_2$ gives an $\alpha$-approximation algorithm for $\Pi_1$. See [87], pg. 348.
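The claim follows by chaining the two conditions of Definition 2 with the guarantee of the approximation algorithm for the second problem. A short derivation, under the assumed notation that $\mathcal{O}_{\Pi_i}$ is the optimal value, $\mathcal{V}_{\Pi_i}$ is the value of a returned solution, and $t$ is the solution produced by the $\alpha$-approximation algorithm on the instance $\mathcal{I}_2 = f(\mathcal{I}_1)$:

```latex
\begin{align*}
\mathcal{V}_{\Pi_1}(\mathcal{I}_1, s)
  &\le \mathcal{V}_{\Pi_2}(\mathcal{I}_2, t)
     && \text{by (2), with } s = g(\mathcal{I}_1, t) \\
  &\le \alpha \cdot \mathcal{O}_{\Pi_2}(\mathcal{I}_2)
     && \text{since } t \text{ is an } \alpha\text{-approximate solution of } \mathcal{I}_2 \\
  &\le \alpha \cdot \mathcal{O}_{\Pi_1}(\mathcal{I}_1)
     && \text{by (1), with } \mathcal{I}_2 = f(\mathcal{I}_1).
\end{align*}
```

Since $f$, $g$ and the approximation algorithm all run in polynomial time, so does their composition.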


2.2. Discrete Morse theory 


Discrete Morse theory has also found several applications in algorithmic combinatorics, algebra and geometry, such as the study of evasiveness of graph properties [33], topological properties of graph complexes [48], minimal resolutions and Gröbner bases [50], poset topology [89], homology of braid groups, and tropical geometry analogues of the Lefschetz section theorem [1]. We see our work as thematically related to each of these works, in the sense that they explore discrete Morse theory's intersections with algebra, geometry, combinatorics and algorithmics. Forman's discrete Morse theory has been generalized in several directions including: Morse theory for vector fields [32], Witten-Morse theory, Novikov-Morse theory, $L^2$-Morse homology [63], equivariant Morse theory [36] and tame flows [74].

Forman's theory is known to hold for regular cell complexes [30].$^{2}$


2.2.1. Elementary Discrete Morse Theory 


Definition 3 (Simplex, Face, Coface, Simplicial Complex). A simplex $\sigma^n$ (see footnote 3) is the convex hull of $n + 1$ affinely independent points $x_0, x_1, \dots, x_n$ in $\mathbb{R}^m$, $n \le m$. A face of a simplex is the convex hull of some subset of the set of its vertices. If $\tau$ is a face of $\sigma$ then $\sigma$ is a coface of $\tau$. A finite collection of simplices in $\mathbb{R}^m$ is called a simplicial complex if any two of its simplices either have no common points or intersect along a common face.

Definition 4 (Boundary & Coboundary of a simplex $\sigma$). Let $\mathcal{K}$ be a simplicial complex and let $\sigma, \tau$ be simplices of $\mathcal{K}$. The relation $\prec$ is defined as: $\tau \prec \sigma \iff \{\tau \subset \sigma \ \& \ \dim \tau = \dim \sigma - 1\}$. The boundary $\mathrm{bd}\,\sigma$ and respectively the coboundary $\mathrm{cbd}\,\sigma$ of a simplex are simply defined as
$\mathrm{bd}\,\sigma = \{\tau \mid \tau \prec \sigma\}$, $\quad \mathrm{cbd}\,\sigma = \{\rho \mid \sigma \prec \rho\}$.

Definition 5 (Hasse graph). The Hasse graph of a complex $\mathcal{K}$ is the graph whose vertices are in 1-1 correspondence with the simplices of the complex, with an edge between the node that represents simplex $\beta^{d}$ and the node that represents simplex $\alpha^{d-1}$ iff $\alpha \prec \beta$. Also, by the term $d$-$(d-1)$ level of the Hasse graph $\mathcal{H}$, we mean the subset of edges of the Hasse graph that join the $d$-dimensional cofaces to the $(d-1)$-dimensional faces. Please see Figure 1.

Definition 6 (Discrete Morse function). Let $\mathcal{K}$ denote a finite simplicial complex and let $\mathcal{L}$ denote the set of simplices of $\mathcal{K}$. A function $\mathcal{F} : \mathcal{L} \to \mathbb{R}$ is called a discrete Morse function (DMF) if it usually assigns higher values to higher dimensional simplices, with at most one exception locally at each simplex. Equivalently, a function $\mathcal{F} : \mathcal{L} \to \mathbb{R}$ is a discrete Morse function if for every $\sigma \in \mathcal{L}$ we have:

(A.) $N_1(\sigma) = \#\{\rho \in \mathrm{cbd}\,\sigma \mid \mathcal{F}(\rho) \le \mathcal{F}(\sigma)\} \le 1$

(B.) $N_2(\sigma) = \#\{\tau \in \mathrm{bd}\,\sigma \mid \mathcal{F}(\tau) \ge \mathcal{F}(\sigma)\} \le 1$.

Definition 7 (Critical/Regular simplices). If $N_1(\sigma) = N_2(\sigma) = 0$ for a simplex $\sigma$ then it is critical, else it is regular.

Definition 8 (Combinatorial Vector Field). A combinatorial vector field (CVF) $\mathcal{V}$ on $\mathcal{K}$ is a collection of pairs of simplices $\{(\alpha, \beta)\}$ such that $\alpha^{(m)} \prec \beta^{(m+1)}$ and each simplex occurs in at most one such pair of $\mathcal{V}$.

2 Simplicial and cubical complexes are special families of regular cell complexes.

3 An $n$-dimensional simplex $\sigma^n$ may be denoted either as $\sigma$ or $\sigma^n$ depending on whether or not we wish to emphasize its dimension. Moreover, vertices of a complex are the same as its 0-dimensional simplices.
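Conditions (A.) and (B.) of Definition 6 and the classification of Definition 7 can be checked mechanically on a small complex. The following is a minimal sketch, assuming simplices are encoded as frozensets of vertex labels and the function is given as a dictionary; the helper names `faces` and `classify` and the example values on the boundary of a triangle are illustrative choices, not taken from the paper.

```python
from itertools import combinations

def faces(simplex):
    """Codimension-1 faces of a simplex given as a frozenset of vertices."""
    if len(simplex) == 1:
        return []
    return [frozenset(c) for c in combinations(sorted(simplex), len(simplex) - 1)]

def classify(values):
    """Check conditions (A.) and (B.) for every simplex and split the
    simplices into critical and regular ones.
    `values` maps every simplex of the complex to its function value."""
    critical, regular = [], []
    for sigma, f_sigma in values.items():
        cofaces = [rho for rho in values if sigma in faces(rho)]
        # N_1: cofaces whose value does not exceed the value of sigma
        n1 = sum(1 for rho in cofaces if values[rho] <= f_sigma)
        # N_2: faces whose value is not below the value of sigma
        n2 = sum(1 for tau in faces(sigma) if values[tau] >= f_sigma)
        if n1 > 1 or n2 > 1:
            raise ValueError(f"not a discrete Morse function at {set(sigma)}")
        (critical if n1 == 0 and n2 == 0 else regular).append(sigma)
    return critical, regular

# Boundary of a triangle (a combinatorial circle) with hypothetical values.
S = frozenset
values = {
    S("A"): 0, S("B"): 1, S("C"): 3,
    S("AB"): 1, S("BC"): 3, S("AC"): 4,
}
critical, regular = classify(values)
print(sorted("".join(sorted(s)) for s in critical))
```

In this example the vertex A and the edge AC are critical, while (B, AB) and (C, BC) are the pairs where a face carries a value at least as large as its coface.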









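Definition 9 (Gradient Vector Field). A pair of simplices $\{\alpha^{(m)} \prec \beta^{(m+1)}\}$ s.t. $\mathcal{F}(\alpha) \ge \mathcal{F}(\beta)$ determines a gradient pair. Each simplex must occur in at most one gradient pair of $\mathcal{V}$. A discrete gradient vector field (DGVF) $\mathcal{V}$ corresponding to a DMF $\mathcal{F}$ is a collection of simplicial pairs $\{\alpha^{(p)} \prec \beta^{(p+1)}\}$ s.t. $\{\alpha^{(p)} \prec \beta^{(p+1)}\} \in \mathcal{V}$ iff $\mathcal{F}(\beta) \le \mathcal{F}(\alpha)$.

Definition 10 (Gradient Path). A $\mathcal{V}$-path is a simplicial sequence $\{\sigma_0^{(m)}, \tau_0^{(m+1)}, \sigma_1^{(m)}, \tau_1^{(m+1)}, \dots, \sigma_q^{(m)}, \tau_q^{(m+1)}, \sigma_{q+1}^{(m)}\}$ s.t. for $i = 0, \dots, q$, $\{\sigma_i \prec \tau_i\} \in \mathcal{V}$, $\tau_i \succ \sigma_{i+1}$ and $\sigma_i \ne \sigma_{i+1}$. A $\mathcal{V}$-path corresponding to a DMF $\mathcal{F}$ is a gradient path of $\mathcal{F}$.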

Theorem 11 (Forman [30]). Let $\mathcal{K}$ be a CW complex with a DMF $\mathcal{F}$ defined on it. Then $\mathcal{K}$ is homotopy equivalent to a CW complex that has precisely one $m$-dimensional cell for every $m$-dimensional critical simplex in $\mathcal{K}$ and no other cells besides these. Moreover, let $c_m$ be the number of $m$-dimensional critical simplices w.r.t. some vector field $\mathcal{V}$, $b_m$ the $m$-th Betti number and $n$ the maximum dimension of $\mathcal{K}$. Then we have:

The Weak Morse Inequalities. Let $\chi$ be the Euler characteristic.

(I.) $c_0 - c_1 + \dots + (-1)^n c_n = b_0 - b_1 + \dots + (-1)^n b_n = \chi(\mathcal{K})$

(II.) For every $m \in \{0, \dots, n\}$ we have $c_m \ge b_m$.

The Strong Morse Inequalities. For every $m \in \{0, \dots, n\}$:

$c_m - c_{m-1} + \dots + (-1)^m c_0 \ge b_m - b_{m-1} + \dots + (-1)^m b_0$
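As a quick sanity check, the weak and strong Morse inequalities can be tested numerically for given vectors of critical cell counts and Betti numbers. This is a minimal sketch; the function names and the list encoding (index $m$ holds $c_m$ or $b_m$) are assumptions made for illustration.

```python
def euler_sum(xs):
    """x_0 - x_1 + ... + (-1)^n x_n."""
    return sum((-1) ** i * x for i, x in enumerate(xs))

def truncated_alternating_sum(xs, m):
    """x_m - x_{m-1} + ... + (-1)^m x_0."""
    return sum((-1) ** (m - i) * xs[i] for i in range(m + 1))

def morse_inequalities_hold(c, b):
    """Return True iff c (critical cell counts) and b (Betti numbers)
    satisfy the weak and the strong Morse inequalities."""
    if len(c) != len(b):
        raise ValueError("c and b must cover the same dimensions")
    dims = range(len(c))
    weak_euler = euler_sum(c) == euler_sum(b)                # (I.)
    weak_termwise = all(c[m] >= b[m] for m in dims)          # (II.)
    strong = all(
        truncated_alternating_sum(c, m) >= truncated_alternating_sum(b, m)
        for m in dims
    )
    return weak_euler and weak_termwise and strong

# A circle has b = [1, 1]; a function with one critical vertex and one
# critical edge is therefore consistent with the inequalities.
print(morse_inequalities_hold([1, 1], [1, 1]))
```

A vector such as `c = [1, 0]` fails the check for the circle, since it would violate both (I.) and (II.).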


2.2.2. Morse Homology 


Theorem 12. If $a < b$ are real numbers such that $[a, b]$ contains no critical values of the Morse function $\mathcal{F}$, then the sublevel set $\mathcal{M}(b)$ is homotopy equivalent to the sublevel set $\mathcal{M}(a)$.

Theorem 13 (Forman [30]). Suppose $\sigma^p$ is a critical cell of index $p$ with $\mathcal{F}(\sigma) \in [a, b]$ and $\mathcal{F}^{-1}(a, b)$ contains no other critical points. Then $\mathcal{M}(b)$ is homotopy equivalent to

$$\mathcal{M}(a) \bigcup_{\dot{e}^p} e^p$$

where $e^p$ denotes a $p$-dimensional cell with boundary $\dot{e}^p$.

Notation 14 ($\mathfrak{B}(\mathcal{K})$, $\mathfrak{M}(\mathcal{K})$). Given a simplicial complex $\mathcal{K}$, $\mathfrak{B}(\mathcal{K}) = \sum_i \beta_i$ and $\mathfrak{M}(\mathcal{K}) = \sum_i c_i$, where both sums run over all dimensions.

Let $\mathcal{F}$ be a discrete Morse function defined on a simplicial complex $W$. Let $C_q(W, \mathbb{Z})$ denote the space of $q$-simplicial chains, and let $\mathcal{M}_q \subseteq C_q(W, \mathbb{Z})$ denote the span of the critical $q$-simplices. Let $\mathcal{M}_*$ denote the space of Morse chains. Let $c_q$ denote the number of critical $q$-simplices. Then we have $\mathcal{M}_q \cong \mathbb{Z}^{c_q}$.

Theorem 15 (Forman). There exist boundary maps $\partial_q : \mathcal{M}_q \to \mathcal{M}_{q-1}$, for each $q$, which satisfy $\partial_q \circ \partial_{q+1} = 0$ and s.t. the resulting differential complex

$$0 \to \mathcal{M}_n \xrightarrow{\partial_n} \mathcal{M}_{n-1} \xrightarrow{\partial_{n-1}} \cdots \xrightarrow{\partial_1} \mathcal{M}_0 \to 0$$

calculates the homology of $W$, i.e. if we go with the natural definition

$$H_q(\mathcal{M}, \partial) = \frac{\ker \partial_q}{\operatorname{im} \partial_{q+1}}$$

then for each $q$, we have $H_q(\mathcal{M}, \partial) = H_q(W, \mathbb{Z})$.

Theorem 16 (Boundary Operator Computation, Forman [30]). Consider an oriented simplicial complex. Then for any critical $(p+1)$-simplex $\beta$ set:

$$\partial \beta = \sum_{\text{critical } \alpha^{(p)}} P_{\alpha\beta}\, \alpha$$

$$P_{\alpha\beta} = \sum_{\gamma \in \Gamma(\beta, \alpha)} N(\gamma)$$

where $\Gamma(\beta, \alpha)$ is the set of discrete gradient paths which go from a face in $\partial\beta$ to $\alpha$. The multiplicity $N(\gamma)$ of any gradient path $\gamma$ is equal to $\pm 1$ depending on whether, given $\gamma$, the orientation on $\beta$ induces the chosen orientation on $\alpha$ or the opposite orientation. With the boundary operator above, the complex computes the homology of the complex $\mathcal{K}$.


In Theorem 13, Forman establishes the existence of a cell complex (let us call it the 'Morse-Smale complex') that is homotopy equivalent to the original complex. For proof details please refer to Forman [30]. The boundary operator in Theorem 16 for the chain complex construction (that is referred to as simply the 'Morse complex') tells us how to use the new CW complex that is built in the construction described in the proof of Theorem 13. Note that the Morse complex itself is a chain complex and not a CW complex. But the chain complex construction allows us to express the simplicial homology of the input complex in terms of the Morse homology of the Morse-Smale complex.

Theorem 1 (Critical simplex cancellation, Forman [30, 34]). Let $\mathcal{F}$ be a discrete Morse function on a simplicial complex $\mathcal{K}$ such that $\sigma^{(p+1)}$ and $\tau^{p}$ are critical. Let there be a unique gradient path from $\partial\sigma$ to $\tau$. Then there is another Morse function $\mathcal{G}$ on $\mathcal{K}$ with the same set of critical simplices except $\tau$ and $\sigma$. Also, the gradient vector field associated to $\mathcal{G}$ is equal to the gradient vector field associated to $\mathcal{F}$ except along the unique gradient path from $\partial\sigma$ to $\tau$.


2.2.3. Graph Theoretic Reformulation 


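Given a simplicial complex $\mathcal{K}$, we construct its Hasse graph representation $\mathcal{H}_{\mathcal{K}}$ (an undirected, multipartite graph) as follows: To every simplex $\sigma^{d} \in \mathcal{K}$, associate a vertex $\sigma^{d} \in \mathcal{H}_{\mathcal{K}}$. The dimension $d$ of the simplex $\sigma^{d}$ determines the vertex level of the vertex $\sigma^{d}$ in $\mathcal{H}_{\mathcal{K}}$. Every face incidence $(\tau^{d-1}, \sigma^{d})$ determines an undirected edge $\langle \tau^{d-1}, \sigma^{d} \rangle$ in $\mathcal{H}_{\mathcal{K}}$. Now orient the graph $\mathcal{H}_{\mathcal{K}}$ to form a new directed graph $\overline{\mathcal{H}}_{\mathcal{K}}$. Initially all edges of $\overline{\mathcal{H}}_{\mathcal{K}}$ have the default orientation. The default orientation is a directed edge $\sigma^{k} \to \tau^{k-1} \in \overline{\mathcal{H}}_{\mathcal{K}}$ that connects a $k$-dimensional node $\sigma^{k}$ to a $(k-1)$-dimensional node $\tau^{k-1}$. Finally, associate a matching $\mathcal{M}$ to the graph $\overline{\mathcal{H}}_{\mathcal{K}}$. If an edge $\langle \tau^{k-1}, \sigma^{k} \rangle \in \mathcal{M}$ then reverse the orientation of that edge to $\tau^{k-1} \to \sigma^{k} \in \overline{\mathcal{H}}_{\mathcal{K}}$. The matching induced reorientation needs to be such that the graph $\overline{\mathcal{H}}_{\mathcal{K}}$ is a directed acyclic graph. A graph matching on $\overline{\mathcal{H}}_{\mathcal{K}}$ that leaves the graph acyclic in the manner prescribed above is known as a Morse matching. Table 1 provides a translating dictionary from simplicial complexes to their Hasse graphs. See Figure 1 and Figure 2.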
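The construction above can be sketched in a few lines: build the default down-edges of the Hasse graph, reverse the matched edges, and test acyclicity. This is a minimal illustration, assuming simplices are encoded as frozensets of vertex labels and a matching is a list of (face, coface) pairs; all function names are hypothetical.

```python
from itertools import combinations

def hasse_edges(simplices):
    """Default down-edges (coface -> face) of the Hasse graph."""
    present = set(simplices)
    edges = []
    for sigma in simplices:
        for c in combinations(sorted(sigma), len(sigma) - 1):
            tau = frozenset(c)
            if tau in present:
                edges.append((sigma, tau))
    return edges

def is_acyclic(nodes, edges):
    """Kahn's algorithm: the digraph is acyclic iff every node can be removed."""
    indeg = {v: 0 for v in nodes}
    out = {v: [] for v in nodes}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(indeg)

def is_morse_matching(simplices, matching):
    """True iff `matching` (pairs (face, coface)) is a matching on the Hasse
    graph whose induced reorientation leaves the graph acyclic."""
    edges = hasse_edges(simplices)
    incidences = {(tau, sigma) for sigma, tau in edges}
    used = [s for pair in matching for s in pair]
    if len(used) != len(set(used)) or not set(matching) <= incidences:
        return False
    matched = set(matching)
    # matched edges become up-edges face -> coface, the rest stay down-edges
    reoriented = [(tau, sigma) if (tau, sigma) in matched else (sigma, tau)
                  for sigma, tau in edges]
    return is_acyclic(simplices, reoriented)

S = frozenset
circle = [S("A"), S("B"), S("C"), S("AB"), S("BC"), S("AC")]
good = [(S("B"), S("AB")), (S("C"), S("BC"))]
bad = good + [(S("A"), S("AC"))]
print(is_morse_matching(circle, good), is_morse_matching(circle, bad))
```

On the boundary of a triangle, matching all three vertices closes the directed cycle A, AC, C, BC, B, AB, A, so the second matching is a matching on the Hasse graph but not a Morse matching.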


2.3. Prior Work 

Since the two problems MMMP and MMUP are NP-hard [49], a logical way to optimize the number of critical cells efficiently is by the use of approximation algorithms. Joswig et al. [49] noted that MMMP and MMUP are NP-hard problems, and posed the approximability of MMUP and MMMP as open problems, by pointing out an error in Lewiner's claim about the inapproximability of MMUP in [57]. Recently, Burton et al. [14] developed an FPT algorithm for optimizing Morse functions. Some of the notable works that seek optimality of Morse matchings either by restricting the problem to 2-manifolds or by applying heuristics are [3, 4, 12, 45, 46, 49, 58, 59, 61]. The approximability of MMMP is established in [76]. Also, we provide an FPT algorithm for counting Morse matchings using graph polynomials in [79]. It is also worth noting that in [79], we use the same rigid edges framework introduced in this work to formulate recurrence relations for Morse polynomials (that count the number of gradient vector fields on a simplicial complex).

[Figure 1 panels: discrete gradient vector field on the complex $\mathcal{K}$ and the corresponding matching on the Hasse graph $\mathcal{H}$ of the complex; default orientation on the Hasse graph $\mathcal{H}$; matching induced reorientation of $\mathcal{H}$, where the matching is chosen such that the reoriented Hasse graph remains acyclic. DGVF on $\mathcal{K}$ ~ acyclic matching on $\mathcal{H}$ ~ {(B, AB), (C, AC), (D, AD), (E, AE), (CD, ACD), (CE, ACE), (BE, ABE), (BC, ABC)}.]

Figure 1: Matching induced orientation of Hasse Graph

Table 1: Graph Theoretic dictionary for Morse Matching

| No. | Morse theory on cell complex $\mathcal{K}$ | Graph theory on Hasse graph $\mathcal{H}_{\mathcal{K}}$ |
|---|---|---|
| 1. | Gradient pair $(\alpha^{d-1}, \beta^{d}) \in \mathcal{V}$ | Matched pair of vertices $(\alpha, \beta) \in \mathcal{H}_{\mathcal{K}}$ |
| 2. | Dimension $d$ | Multipartite graph level $d$ |
| 3. | $\alpha^{d-1} \prec \tau^{d}$ s.t. $(\alpha^{d-1}, \tau^{d}) \notin \mathcal{V}$ | Default down-edge $\tau \to \alpha$ |
| 4. | $\alpha^{d-1} \prec \tau^{d}$ s.t. $(\alpha^{d-1}, \tau^{d}) \in \mathcal{V}$ | Matching up-edge $\alpha \to \tau$ |
| 5. | $\mathcal{V}$-path | Directed path |
| 6. | Non-trivial closed $\mathcal{V}$-path | Directed cycle |
| 7. | CVF | Matching on the Hasse graph |
| 8. | DGVF | Morse matching (i.e. acyclic matching) |
| 9. | Critical cell $\xi^{d}$ | Unmatched vertex $\xi$ |
| 10. | Regular cell $\xi^{d}$ | Matched vertex $\xi$ |

Figure 2: Matching induced reorientation. Example 2


2.4. Problem Definition & Contributions 

In computer science terms, the max-Morse matching problem (MMMP) can be described as follows: Consider a bounded degree multipartite graph $\mathcal{H}$. Associate a matching induced reorientation to $\mathcal{H}$, such that the oriented graph $\mathcal{H}_{\mathcal{M}}$ is acyclic. The goal is to maximize the cardinality of matched (regular) nodes. For the min-Morse unmatched problem (MMUP), we find a matching that keeps the graph $\mathcal{H}_{\mathcal{M}}$ acyclic while minimizing the number of unmatched (critical) nodes. More formally:

Definition 17 (Min-Morse Unmatched Problem). The vector field that minimizes the number of critical cells over the set of all DGVFs that can be defined on a regular cell complex, say $\mathcal{K}$, is known as the Min-Morse Matching on the complex $\mathcal{K}$. The Min-Morse Unmatched Problem is to find such an optimal Morse Matching.
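For very small complexes, the optimum of Definition 17 can be found by exhaustive search, which is useful as a reference when testing heuristics or approximation algorithms. The sketch below is exponential in the number of face incidences and is only meant to make the problem statement concrete; the frozenset encoding and the function names are assumptions made for illustration.

```python
from itertools import combinations

def incidences(simplices):
    """All (face, coface) pairs whose dimensions differ by one."""
    return [(tau, sigma) for sigma in simplices for tau in simplices
            if len(tau) + 1 == len(sigma) and tau < sigma]

def is_acyclic(nodes, edges):
    """Kahn's algorithm on a digraph given by an edge list."""
    indeg = {v: 0 for v in nodes}
    out = {v: [] for v in nodes}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(indeg)

def min_unmatched(simplices):
    """Brute-force MMUP: return (number of critical cells, Morse matching)."""
    pairs = incidences(simplices)
    n = len(simplices)
    # Try the largest matchings first; the empty matching is always acyclic.
    for k in range(n // 2, -1, -1):
        for matching in combinations(pairs, k):
            used = [s for pair in matching for s in pair]
            if len(used) != len(set(used)):
                continue  # some simplex is matched twice
            chosen = set(matching)
            # matched pairs give up-edges, all other incidences down-edges
            edges = [(t, s) if (t, s) in chosen else (s, t) for t, s in pairs]
            if is_acyclic(simplices, edges):
                return n - 2 * k, list(matching)

S = frozenset
circle = [S("A"), S("B"), S("C"), S("AB"), S("BC"), S("AC")]
disk = circle + [S("ABC")]
print(min_unmatched(circle)[0], min_unmatched(disk)[0])
```

The hollow triangle needs two critical cells (one vertex and one edge), while the filled triangle collapses to a single critical vertex.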

Notation 18 ($\ll$). For any two functions $f, g$, if $f = \tilde{O}(1)$, i.e. $f = \log^{O(1)}(g)$, then we denote it as $f \ll g$.














Table 2: MMUP-APX Algorithmic Contribution

| | ARV based rounding | MWUM based solution |
|---|---|---|
| min-DBCRE-APX ratio | $O(\sqrt{\log n})$ | $O(\log n)$ |
| MMUP-APX ratio | $O(\log^{3/2} n)$ | $O(\log^{2} n)$ |
| Methodology | Leighton-Rao (Interior Point + ARV) | Leighton-Rao (MWUM) |
| Time Complexity | Polynomial time | Nearly linear time |

Table 3: MMUP-APX Contribution to Applications

| Problem | Time complexity | Remarks |
|---|---|---|
| MMUP-APX | $\tilde{O}(n)$ | Ratio: $O(\log^{2} n)$. Resolves an open problem from [49] |
| $H(\mathcal{K}, A)$ | $\tilde{O}(n)$ | Assuming WMOC, for arbitrary coefficients |
| Persistent Homology | $\tilde{O}(n^{2})$ | Assuming WMOC, for arbitrary coefficients |
| Scalar Field Topology | $\tilde{O}(n)$ | Finds an APX-optimal compatible WMDGVF |


Definition 19 (Strong and Weak Morse Optimality Conditions (SMOC/WMOC)). The aggregate Betti number is the sum of Betti numbers across dimensions. For any given discrete vector field, the aggregate Morse number is the sum of Morse numbers across dimensions. Given a simplicial complex $\mathcal{K}$ with $\mathfrak{B}(\mathcal{K})$ and $\mathfrak{M}(\mathcal{K})$ representing the aggregate Betti and the optimal aggregate Morse numbers respectively, we say that:

1. A family of simplicial complexes $\Omega$ satisfies the strong Morse optimality condition when for each $\mathcal{K} \in \Omega$, $\mathfrak{M}(\mathcal{K}) = O(\mathfrak{B}(\mathcal{K})) = \tilde{O}(1)$.

2. A family of simplicial complexes $\Omega$ satisfies the weak Morse optimality condition when for each $\mathcal{K} \in \Omega$, $\mathfrak{M}(\mathcal{K}) = \tilde{O}(1)$, i.e. when $\mathfrak{M}(\mathcal{K}) \ll |\mathcal{K}|$.

Table 2 and Table 3 list the algorithmic and the application related contributions respectively.


3. Motivation 

3.1. Why is DMT central to Algorithmic Topology in the Big Data Era? 

The computer assisted proof for the Lorenz equations [68] by Mischaikow and Mrozek (along with the Delfinado-Edelsbrunner paper) may very well be seen as the foundation stones of modern computational topology. From the earliest days, Mischaikow, Mrozek (and collaborators) have relied on reductions, as opposed to reducing the complexity of Smith normal form algorithms, for improving runtime. The RedHom-ChoMP application of discrete Morse theory [46] reports dramatic reductions in complex size using discrete Morse theory (albeit on a very limited dataset), while [70] applies discrete Morse theory to obtain reductions in persistent homology computations. The approximability of Morse matching has been a well-known open problem in computational topology over the last decade. Its importance can be acutely gauged from the following remarks by the authors of [70]:

"The efficiency of our approach depends crucially on m being much smaller than n ..."

"... constructing an optimal acyclic matching - that is, a matching which minimizes m is NP hard. Providing sharp bounds on optimal m values relative to n for arbitrary complexes would require major breakthroughs in algebraic Topology as well as graph theory. ..." $^{5}$

The first quotation justifies our weak Morse optimality condition for deriving complexity bounds. The second of the two quotations is a direct appraisal of the MMUP approximability problem.$^{6}$ These observations may elicit the following question: Why does discrete Morse theory have so much computational flexibility when it comes to reductions? The author believes that the answer lies in the fact that the optimal discrete Morse function finds the minimal sized complex over a wide range of complexes that are simple homotopy equivalent to the input complex. We qualify our statement with the words "wide range of complexes", as opposed to saying "all complexes", because the range does not include the set of all simple homotopy equivalent complexes. This can be concretely concluded by observing the NP-hardness proof of Morse matchings provided in [49], which clearly allows us to see why the optimal collapsibility problem and the optimal discrete Morse function problem are not the same.

3.2. Universal Beauty & Applicability of DMT 

Forman’s theory is unrivaled in its beauty and simplicity when compared to earlier combinatorial 
adaptations of Morse theory. Its combinatorial power comes from the fact that the function 
defined over a combinatorial space is also discrete. Its topological elegance comes from the 
fact that each pair of matched simplices in Forman’s framework corresponds to an elementary 
collapse on the simplicial complex. Riding on a spate of mathematical and computational 
applications, Forman’s theory has emerged as the definitive combinatorial analogue of Morse 
theory. Discrete Morse theory has evolved into an important tool in algebraic, geometric and
topological combinatorics [55, 75, 89].

3.3. Why is DMT a prime target for methods from Algorithmic frontiers? 

The primary construct in discrete Morse theory, namely the discrete gradient vector field, can essentially be specified as a graph matching induced reorientation of a directed Hasse graph such that the reoriented graph has no cycles. Graph matching is one of the most studied problems in computer science and, in fact, the study of polynomial time algorithms began with the complexity analysis of a graph matching algorithm by Edmonds. The acyclic subgraph problem was one of the 21 original problems shown by Karp to be NP-complete. Moreover, the Hasse graph of a simplicial complex is a structured graph, namely a sparse multipartite graph. The special structures of sparseness and bipartiteness often lead to simpler, faster algorithms for a variety of computer science problems. In addition to that, discrete Morse theory is also intimately related to evasiveness and monotone graph properties. Owing to the combinatorial (in fact, graph theoretic) nature of its specification, discrete Morse theory is a natural target for methods from classical non-numerical algorithms.

5 where m is the sum of Morse numbers across dimensions and n is the size of the complex.

6 The range of applications of the MMUP-APX algorithm should not come as a surprise to those familiar with computational topology (especially when one realizes that DMT is equivalent to looking for a minimum size complex over a large set of complexes that are simple homotopy equivalent to it, using entirely combinatorial methods.)


4. MMUP-APX Algorithm: A Precis 


Our algorithm proceeds in the following steps:

1. Reduce the min-Morse unmatched problem to a variant of the min-feedback-arc-set (min-FAS) problem (that we refer to as the min-partially oriented problem (min-POP)). Suppose we are given a Hasse graph $\mathcal{G}$ with edge set $E_{\mathcal{G}}$. Then, analogous to min-FAS, the end goal of min-POP is to minimize the number of edges removed from the graph $\mathcal{G}$, namely $E_X$, such that the graph $\mathcal{G}(V, E_{\mathcal{G}} \setminus E_X)$ is acyclic. There is one crucial difference (between min-FAS and min-POP) however: the edge set $E_{\mathcal{G}}$ is composed of rigid edges $E_{\mathcal{R}}$ and normal edges $E_{\mathcal{N}}$, i.e. $E_{\mathcal{G}} = E_{\mathcal{R}} \cup E_{\mathcal{N}}$. Moreover, the set $E_X \subseteq E_{\mathcal{N}}$. So rigid edges cannot be deleted. In section 6 (Theorem 45 and Theorem 46), we show that such a reduction can be achieved in nearly linear space and time.

2. Having reduced the min-Morse unmatched problem to a variant of min-FAS (i.e. min-POP), we formulate the min-POP problem as a vector program. We then relax the 0/1-edge constraints for normal edges.

3. Instead of solving the min-POP directly, we interpret the SDP formulation above as a variant of the min-directed balanced cut (min-DBC) problem, namely the min-DBCRE problem. In the case of the min-DBCRE problem, we cannot cut any of the rigid edges. Otherwise min-DBCRE and min-DBC are essentially the same. It is worth noting that any solution to the min-FAS (and hence the min-POP) can be interpreted as a series of possibly sub-optimal directed balanced cuts. One can intuit this by observing the cut-based mechanism in Figure 5. The details are discussed in section 5.

4. Therefore, we use the Leighton-Rao divide-&-conquer scheme that relaxes the min-POP constraints of the problem instance at that particular recursion level by formulating it as a min-DBCRE SDP and then solving this SDP.

5. Solve + Round

   a) Method 1: ARV-rounding based. Please see Figure 3.

      i. The SDP is solved using the Interior Point Method.

      ii. Decompose the SDP solution into individual vectors using Cholesky factorization.

      iii. Following that, we round it using Agarwal et al.'s method (based on ARV rounding). Subsubsection 9.1.3 describes how forbidden edges are handled in the context of the ARV rounding procedure. Subsection 9.2 establishes an $O(\sqrt{\log n})$ ratio for min-directed balanced cut (with rigid edges) by extending Agarwal et al.'s framework to accommodate rigid edges.

         A. For the ARV rounding procedure to work, the SDP must geometrically embed the vertex set into an $\ell_2^2$-metric space. (See Theorem 52 and Theorem 53.)

         B. The rounding procedure consists of projections on random hyperplanes, and the approximation ratio is an outcome of high-dimensional measure concentration phenomena.

   b) Method 2: Matrix Multiplicative Weights Update Method based. Please see Figure 4.

      i. Use binary search to reduce the optimization problem to a feasibility problem (pg. 6, Section 3 of 0). If $\alpha$ is the current guess for the optimum value of the SDP, then we either try to construct a PSD matrix that is primal feasible and has value $> \alpha$, or a dual feasible solution whose value is at most $(1 + \delta)\alpha$ for an arbitrarily small value of $\delta$. For every iteration, the violation-checking oracle starts by applying the rounding algorithm on the current primal solution.

         A. If the violation-checking oracle fails then the current iterate is primal feasible with value $< \alpha$.

         B. If the violation-checking oracle succeeds then the current iterate is either primal infeasible or has optimal value $> \alpha$. (The failure to round is so spectacular that the algorithm finds a definitive way to move towards primal feasibility.)

         C. For every guess $\alpha$, if the violation-checking oracle does not fail for $O(\rho^2 \log n / \alpha^2)$ iterations, then the algorithm provides a feasible dual solution with value at most $(1 + \delta)\alpha$.

      ii. Matrix exponentiation: In each iteration, we compute the Cholesky decomposition of the matrix exponential to implement the multiplicative weights update rule, which involves matrix exponentiation. It is sufficient to compute a $(1 + \epsilon)$-approximation of the Cholesky decomposition. This is done using random projections onto an $O(\log n / \epsilon^2)$ dimensional space and subsequently using the Johnson-Lindenstrauss lemma. Computing only an approximate value of the matrix exponential leads to drastic gains in complexity.

      iii. The principal computational bottleneck of the MWUM algorithm is a max-flow subroutine. The flow subroutines are used in the violation-checking oracle to check if the triangle inequalities (along with the objective value constraint) belonging to the SDP are satisfied. Interestingly, the rationale of using flows to check for violations is developed as part of the 'expander flow' framework developed in 056). By using a nearly-linear time max-flow algorithm, we can arrive at an approximate solution of the SDP in nearly linear time.
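To make the role of rigid edges in step 1 concrete, the following minimal sketch solves tiny min-POP instances exactly by exhaustive search. It is not the approximation algorithm of this paper, only a reference implementation of the problem statement; the function names, the edge-list encoding, and the example digraph are hypothetical.

```python
from itertools import combinations

def is_acyclic(nodes, edges):
    """Kahn's algorithm on a digraph given by an edge list."""
    indeg = {v: 0 for v in nodes}
    out = {v: [] for v in nodes}
    for u, v in edges:
        out[u].append(v)
        indeg[v] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    removed = 0
    while stack:
        u = stack.pop()
        removed += 1
        for v in out[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                stack.append(v)
    return removed == len(indeg)

def min_pop_bruteforce(nodes, rigid, normal):
    """Smallest set E_X of normal edges whose removal leaves the digraph
    acyclic. Rigid edges are never removed. Returns None when the rigid
    edges alone already contain a cycle (the instance is infeasible).
    Assumes a simple digraph (no repeated edges)."""
    if not is_acyclic(nodes, rigid):
        return None
    for k in range(len(normal) + 1):
        for removed in combinations(normal, k):
            kept = [e for e in normal if e not in removed]
            if is_acyclic(nodes, list(rigid) + kept):
                return list(removed)

nodes = ["a", "b", "c", "d"]
rigid = [("a", "b"), ("b", "c")]
normal = [("c", "a"), ("c", "d"), ("d", "b")]

constrained = min_pop_bruteforce(nodes, rigid, normal)
unconstrained = min_pop_bruteforce(nodes, [], rigid + normal)
print(len(constrained), len(unconstrained))
```

In this example the two directed cycles share the rigid edge (b, c). Plain min-FAS may delete that single edge, whereas min-POP must delete one normal edge from each cycle, which illustrates why min-POP is not just min-FAS on a subgraph.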





Figure 3: Divide-&-Conquer SDP Algorithm 1: ARV Rounding based



Figure 4: Divide-&-Conquer SDP Algorithm 2: MWUM based 




































































5. The Leighton-Rao Divide-&-Conquer Paradigm 


Note that the min-Morse problem is reduced to min-POP (which is essentially equivalent to min-FAS with rigid edges). See section 6. Now, min-FAS can be approximated by a divide-&-conquer procedure, namely the Leighton-Rao method, wherein one effectively solves the min-FAS problem instance by decomposing it into multiple min-directed balanced cut (min-DBC) approximation problem instances. See [85], Section 5.1 and Section 5.4. The approximation algorithm for min-DBC acts as a subroutine whose outputs add up to an approximate solution of the min-FAS instance. In a similar vein, one approximates the min-POP instance by using an approximation algorithm for min-DBCRE (min-directed balanced cut with rigid edges) as a subroutine in Leighton-Rao applied to min-POP.


This section explains an important algorithmic technique (Leighton-Rao divide-&-conquer) used in the design of this approximation algorithm. But, more importantly, it also gives a bird's eye view of a variety of other tools and ideas involved. In a series of breakthrough results, Leighton and Rao designed an elegant meta-algorithm [62] that uses a divide-and-conquer strategy to approximate a wide range of combinatorial problems with impressive performance guarantees. Their algorithms also provide approximate max-flow min-cut theorems for multicommodity flow problems. Given a combinatorial problem $\Pi_1$, the time complexity and approximation ratio of their algorithm(s) $A_1$ is intimately tied to an external state-of-the-art $c$-balanced cut approximation algorithm $A_2$. The algorithm $A_2$ is used as a subroutine for algorithm $A_1$. We will provide a highly simplified overview that addresses only the essential underlying idea. [85] provides an excellent survey, which may be of interest to the more inclined reader. Given a minimization problem $\Pi_1$ on an input graph $G(V, E)$ of size $N$:

Step 1. We solve the $c$-balanced cut problem on $G(V, E)$ by application of the $\alpha$-approximation algorithm $A_2$. The factor $\alpha$ ensures that a single balanced cut will be at most $\alpha$ times the cost of the optimal balanced cut at that recursion level.

Step 2. The cut will also divide the vertices of the original graph into two vertex sets $V_1$ and $V_2$. The edges that are not cut can be used to construct induced subgraphs $G(V_1, E_1)$ and $G(V_2, E_2)$. Let $|V_1| = N_1$ and $|V_2| = N_2$. We now apply Step 1 on $G(V_1, E_1)$ and $G(V_2, E_2)$. The recursion stops each time we encounter a solitary vertex.
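The two steps can be sketched as a short recursion. In the sketch below, an exhaustive search stands in for the balanced cut subroutine (in the paper this role is played by an approximation algorithm for min-DBCRE), every cut is required to leave at least a third of the vertices on each side, and the rigid edges are assumed to be acyclic so that a feasible cut always exists. All names are hypothetical.

```python
from itertools import combinations

def best_balanced_cut(nodes, rigid, normal):
    """Brute-force stand-in for the balanced cut subroutine: an ordered
    partition (S, T) with both sides holding at least a third of the nodes
    (and at least one), no rigid edge from T to S, and as few normal edges
    from T to S as possible."""
    n = len(nodes)
    lo = max(1, n // 3)
    best = None
    for k in range(lo, n - lo + 1):
        for subset in combinations(nodes, k):
            S = set(subset)
            if any(u not in S and v in S for u, v in rigid):
                continue  # a rigid edge would point backwards
            cut = [(u, v) for u, v in normal if u not in S and v in S]
            if best is None or len(cut) < len(best[2]):
                best = (S, set(nodes) - S, cut)
    return best

def divide_and_conquer(nodes, rigid, normal):
    """Return (linear order, removed normal edges). Every edge that is not
    removed points forward in the returned order."""
    nodes = list(nodes)
    if len(nodes) <= 1:
        return nodes, []
    S, T, cut = best_balanced_cut(nodes, rigid, normal)

    def inside(X, edges):
        return [(u, v) for u, v in edges if u in X and v in X]

    order_s, removed_s = divide_and_conquer(
        [v for v in nodes if v in S], inside(S, rigid), inside(S, normal))
    order_t, removed_t = divide_and_conquer(
        [v for v in nodes if v in T], inside(T, rigid), inside(T, normal))
    return order_s + order_t, cut + removed_s + removed_t

nodes = ["a", "b", "c", "d"]
rigid = [("a", "b"), ("b", "c")]
normal = [("c", "a"), ("c", "d"), ("d", "b")]
order, removed = divide_and_conquer(nodes, rigid, normal)
print(order, removed)
```

The cut edges collected along the recursion are exactly the edges that point backwards in the final order, so deleting them leaves an acyclic graph in which no rigid edge has been touched.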


Please see Figure 5 for a quick overview. In the remainder of this section we sketch the details of this scheme, which gives us an $O(\alpha \log n)$ factor algorithm for problem $\Pi_1$. [62, 85] offer more details. We follow the treatment delineated in [85]. Assume that we have an $\alpha$-factor approximation algorithm for min-DBCRE. Now, observe that every solution of min-POP is a linear ordering (DMF) corresponding to some partial order (DGVF). Irrespective of which linear order we choose, we may obtain a balanced cut from it, and the cost of the directed balanced cut will be an additive part of the objective of min-POP. Please refer to Figure 5. The cost of this linear ordering is at least the cost of this directed balanced cut, and hence the cost of the optimal min-DBCRE is upper bounded by the cost of the optimal solution of min-POP. We may apply the same idea recursively on problems of size $cn$ and $(1 - c)n$. Clearly the objective value of the divide-&-conquer algorithm will satisfy

$$A(\mathcal{G}) \le \max\{A(\mathcal{G}_1), A(\mathcal{G}_2)\} + \mathrm{DBCRE}_{APX}(\mathcal{G}_1 | \mathcal{G}_2) \qquad (3)$$









Table 4: Tabular summary of algorithms and complexity 

                        ARV based rounding                   MWUM based solution 
min-DBCRE-APX ratio     $O(\sqrt{\log n})$                   $O(\log n)$ 
MMUP-APX ratio          $O(\log^{3/2} n)$                    $O(\log^2 n)$ 
Methodology             Leighton-Rao (Interior Point+ARV)    Leighton-Rao (MWUM) 
Time Complexity         Polynomial time                      Nearly linear time 


Since $\mathrm{DBCRE}_{\mathrm{OPT}} \le \mathrm{minPOP}_{\mathrm{OPT}}$, we also have $\mathrm{DBCRE}_{\mathrm{APX}}(G_1 \mid G_2) \le \alpha \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G)$, 
where $\alpha$ is the approximation ratio of the min-DBCRE approximation subroutine. Therefore, we 
can write the above equation as 

$A(G) \le \max\{A(G_1), A(G_2)\} + \alpha \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G)$    (4) 

For each level of recursion, we incur a cost of at most $\alpha \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G_k)$, while noting that 
$\mathrm{minPOP}_{\mathrm{OPT}}(G_k) \le \mathrm{minPOP}_{\mathrm{OPT}}(G)$. There are $\lceil O(\log n) \rceil$ levels of recursion (where the 
base of the logarithm depends on the ratio c used in balancing the cut). Applying a basic 
inductive argument we conclude that: 

$A(G) \le \alpha \cdot O(\log n) \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G)$    (5) 
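
The inductive argument can be written out by unrolling the recurrence along the branch that attains the maximum at each level. In this sketch, $G^{(k)}$ is notation assumed here for the subgraph reached at recursion depth $k$ (with $G^{(0)} = G$), $L = O(\log n)$ is the number of recursion levels, and a solitary vertex is assumed to contribute zero cost.

```latex
\begin{align*}
A(G) &\le \sum_{k=0}^{L-1} \alpha \cdot \mathrm{minPOP}_{\mathrm{OPT}}\bigl(G^{(k)}\bigr)
      && \text{(apply the recurrence once per level)} \\
     &\le L \cdot \alpha \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G)
      && \text{(since } \mathrm{minPOP}_{\mathrm{OPT}}(G^{(k)}) \le \mathrm{minPOP}_{\mathrm{OPT}}(G)\text{)} \\
     &= \alpha \cdot O(\log n) \cdot \mathrm{minPOP}_{\mathrm{OPT}}(G).
\end{align*}
```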


Now, we consider the approximation ratio of min-directed balanced cut with rigid edges 
(min-DBCRE). 

1. To begin with, note that if we are ready to forgo the nearly linear aspect of time complexity, 
then we may obtain an approximation ratio of $O(\sqrt{\log n})$ as proved in Theorem 60 from 
subsection 9.2. This is a direct consequence of our ability to handle rigid edges within the 
ARV framework of rounding as seen in section 9, specifically in Theorem 59 and more 
importantly in Theorem 60 from subsection 9.2. 

2. However, if we wish nearly linear time computation, we may use single commodity max-flows 
as subroutines within violation checking oracles. With this approach one obtains an 
approximation ratio of $O(\log n)$. Note that in section 10, specifically in Theorem 10.1.3, 
we show how to handle rigid edges in the MWUM violation checking oracle. The ability to 
handle rigid edges within the MWUM framework gives us a nearly linear time algorithm for 
an $O(\log n)$ approximate solution for min-DBCRE. 

Clearly, if we use ARV-based rounding for min-DBCRE as described in Theorem 60 in subsection 9.2, 
then $\alpha = O(\sqrt{\log n})$, giving us an approximation ratio of $O(\log^{3/2} n)$ for MMUP. In 
contrast, if we use the MWUM based solution for min-DBCRE as described in Theorem 10.1.3 in 
section 10, we get $\alpha = O(\log n)$, giving us an approximation ratio of $O(\log^2 n)$ for MMUP. 

Please see Table 4 for a summary of complexity implications. 








































Figure 5: Divide-&-Conquer: black: forward normal edges, red: backward normal edges, green: rigid edges 
























6. Reduction of MMUP to min-POP 


6.1. Definitions: A Garden of Edges 

The procedure involves gadget construction with a flavor reminiscent of the Garey-Johnson 
book. Given an input graph H we construct a gadget Hr. 

Definition 20 (Rigid edges, Forbidden edges and Normal edges in gadget Hr). We have three 
types of edges, namely: 

• Rigid edges (R-edges) are a set of prespecified oriented edges whose inclusion in every 
desired output solution is made mandatory. 

• The edges complementary to R-edges are known as Forbidden edges (F-edges) and we 
enforce the prohibition of forbidden edge orientations in every desired output. 

• Normal edges (N-edges) are edges whose inclusion/exclusion is not enforced. An edge 
with an orientation complementary to an N-edge is also an N-edge. In the desired solution, 
we are free to choose either of the two orientations - an N-edge or its complementary 
N-edge. 

Definition 21 (Paths and Cycles in gadget $H_R$). A path (or a cycle) in $H_R$ composed entirely 
of R-edges is known as an R-path (or an R-cycle). Analogously we may also define N-paths 
and N-cycles. A path (or a cycle) in $H_R$ composed of R-edges as well as N-edges is known as 
an RN-path (or an RN-cycle). 

Note 6.1. In the context of our problem, we have two types of N-edges: down-N-edges (with 
default downward orientation) and up-N-edges (associated with matching induced reorientation), 
and our typical objective is to either optimize or count the number of up-N-edges. 
The up-N-edges and down-N-edges are also denoted as $N_+$-edges and $N_-$-edges respectively. The basic idea is to do the 
following: Given a Hasse graph $H$, we enlarge the vertex set and edge set of $H$ to form a gadget 
$H_R$ to ensure that for every cycle in $H$, there is a corresponding cycle in $H_R$. We refer to 
these cycles as C-cycles. More importantly, for every matching in $H$ there is a corresponding 
cycle in $H_R$. We refer to these cycles as M-cycles. Rigid edges introduced to break some 
C-cycles in $H_R$ are known as CR-edges and those introduced to break M-cycles are known 
as MR-edges (R-edges = CR-edges + MR-edges). We denote an acyclic orientation of $H_R$ by 
$\overline{H}_R$. One can easily retrieve $\overline{H}_{\mathcal{K}}$ from $\overline{H}_R$, where $\overline{H}_{\mathcal{K}}$ is a directed acyclic matching induced 
reorientation of the original Hasse graph of complex $\mathcal{K}$, namely $H_{\mathcal{K}}$. 

We now parsimoniously reduce the problem of finding an acyclic matching on $H$ to that of 
finding an acyclic orientation $\overline{H}_R$ on $H_R$, with the added condition that if $e$ is an R-edge in $H_R$ 
then $e \in \overline{H}_R$. 

Definition 22. We define the decision problem k-Partially-Oriented Problem (k-POP) 
as follows: Given a digraph $H(V, E)$, whose edge set is composed of R-edges and N-edges, i.e. 
$E = E_R + E_N$, we optimize the number of N-edges, while ensuring that the gadget formed out of 
the selected edges remains acyclic. The corresponding optimization versions of the problem are 
referred to as min-POP and max-POP respectively. 

The objective function is a linear (possibly weighted) function of N-edges. Essentially, in 
order to optimize in the presence of constraints, we break all cycles in $H_R$ by rejecting a set of 
$N_+$-edges and $N_-$-edges, s.t. the number of $N_+$-edges rejected is minimized. 
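
A feasibility check for a candidate solution can be sketched as follows. This is a minimal sketch under stated assumptions: the argument names are hypothetical, a candidate is assumed to pick exactly one orientation from every N-edge pair, and all R-edges are always included. Acyclicity is tested with Kahn's topological sort, a standard technique rather than anything specific to this paper.

```python
def is_feasible(rigid, normal_pairs, chosen):
    """rigid: iterable of directed R-edges (u, v), always part of the output.
    normal_pairs: list of (a, b), meaning N-edges a->b and b->a are on offer.
    chosen: set of directed N-edges selected by the candidate solution."""
    allowed = set()
    for a, b in normal_pairs:
        allowed.update({(a, b), (b, a)})
        # assumption: exactly one orientation of each pair must be selected
        if ((a, b) in chosen) == ((b, a) in chosen):
            return False
    if not set(chosen) <= allowed:
        return False

    arcs = set(rigid) | set(chosen)
    nodes = {x for arc in arcs for x in arc}
    indegree = {x: 0 for x in nodes}
    successors = {x: [] for x in nodes}
    for u, v in arcs:
        successors[u].append(v)
        indegree[v] += 1

    # Kahn's algorithm: the digraph is acyclic iff every node gets removed
    ready = [x for x in nodes if indegree[x] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in successors[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return removed == len(nodes)
```

With rigid edges `t1 -> b2` and `t2 -> b1` and pairs `(b1, t1)`, `(b2, t2)`, choosing both upward orientations closes a cycle of length 4, so that candidate is rejected.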



6.2. The Construction 


Edge Duplication $H_0 \to H_1$ Given an undirected Hasse graph $H_0$, construct a new graph $H_1$ 
with the same incidence relations as in $H_0$, except that every undirected edge between nodes 
A and B in $H_0$ becomes two directed edges, $A \to B$ and $B \to A$, in the new graph $H_1$. 


E.g. graph A.0 in Figure 6 $\to$ graph C.1 in Figure 7 


Edge-Pair Isolation & Cloning of Vertices $H_1 \to H_2$ We isolate all $N_+$-$N_-$-edge pairs that 
have two vertices in common. In the new graph, for every vertex, say $v_i$, we will have 
$d(v_i)$ clones, one clone for each $N_+$-$N_-$-edge pair incident on $v_i$. This creates a graph 
$H_2$ with all edge-pairs disconnected from each other. While every vertex will have $d(v_i)$ 
clones of itself, every edge from $H_1$ is uniquely represented as one of the edges in an edge 
pair of $H_2$, which gives us a 1-1 correspondence with the edges in $H_1$. 

E.g. Graph A $\to$ Graph B in Figure 9. Also, Graph A $\to$ Graph B in Figure B24. 
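
The first two steps of the construction can be sketched in a few lines. The naming of clones as `(vertex, edge_index)` tuples is an assumption made for this sketch and is not taken from the paper.

```python
def duplicate_edges(undirected_edges):
    """H0 -> H1: every undirected edge {a, b} becomes arcs a->b and b->a."""
    arcs = []
    for a, b in undirected_edges:
        arcs.append((a, b))
        arcs.append((b, a))
    return arcs


def isolate_pairs(undirected_edges):
    """H1 -> H2: every edge pair gets its own clones of both endpoints,
    so that no two edge pairs share a vertex any more."""
    pairs = []
    for k, (a, b) in enumerate(undirected_edges):
        clone_a, clone_b = (a, k), (b, k)   # hypothetical clone naming
        pairs.append(((clone_a, clone_b), (clone_b, clone_a)))
    return pairs
```

A vertex of degree $d$ in the input ends up with $d$ clones, one per incident edge pair, while the number of arcs is unchanged.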

Addition of CR-edges $H_2 \to H_{3A}$ We say that two $N_+$-edges are adjacent if there is an 
$N_-$-edge joining them. Now we treat each $N_+$-edge in $H_2$ as if it were a vertex, and 
we join two $N_+$-edges in $H_{3A}$ iff they are adjacent in $H_2$. These adjacency edges are 
essentially R-edges that form the CR-edges. This construction is reminiscent of that of a 
line graph, except that we are working with oriented graphs and, even more specifically, 
with $N_+$-edges as vertices and $N_-$-edges as adjacency arcs. See Note 6.2 on why cycles 
are necessarily "bipartite" (i.e. restricted to a single level). 

E.g. Graph B.1 $\to$ Graph B.2 in Figure 6 


Addition of MR-edges/FFT-edges $H_2 \to H_{3B}$ We have two ways of enforcing matching 
constraints using rigid edges: MR-edges and pseudo-FFT edges. The MR-gadget is 
conceptually the simpler of the two, but can be potentially quadratic in the size of the input, 
an undesirable bottleneck. The pseudo-FFT gadget is more sophisticated but has the advantage 
of being linear in size. Clearly, if two conflicting edges match, we have an RN-cycle of 
size 4 with two $N_+$-edges and two R-edges connecting them (where the $N_+$-edges are part of 
the matching induced reorientation). This basic observation allows us to express matching 
constraints in the form of cycle constraints. Surely, there is a (so-far) uninvestigated model 
theory aspect involved here. 


MR-edges $H_2 \to H_{3B.1}$ (Method 1) For each pair of $N_+$-edges $E_i$ and $E_j$ in $H_2$ which 
share a vertex and thus have a matching conflict, we form an R-edge that joins the 
top of $E_i$ to the bottom of $E_j$ and another R-edge that joins the top of $E_j$ to the bottom of 
$E_i$ in graph $H_{3B.1}$. At the end of this we obtain $H_{3B.1}$. 
E.g. Graph A.1 $\to$ Graph A.2 in Figure 6 
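
Method 1 can be sketched directly from its description. In this sketch, the node names `("T", k)` and `("B", k)` for the top and bottom of the k-th isolated edge pair are hypothetical, and a matching conflict is detected on the original (pre-isolation) endpoints.

```python
from itertools import combinations


def mr_edges(edges):
    """edges: list of (lower, upper) endpoint pairs from the Hasse graph.
    Returns the rigid MR-edges joining tops to bottoms of conflicting pairs."""
    rigid = []
    for i, j in combinations(range(len(edges)), 2):
        if set(edges[i]) & set(edges[j]):        # shared vertex: conflict
            rigid.append((("T", i), ("B", j)))   # top of E_i -> bottom of E_j
            rigid.append((("T", j), ("B", i)))   # top of E_j -> bottom of E_i
    return rigid
```

For a vertex with $N$ incident edges this produces $N(N-1)$ rigid edges, which is the quadratic growth that motivates Method 2.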

pseudo-FFT edges $H_2 \to H_{3B.2}$ (Method 2) Here we take inspiration from the fast 
Fourier transform (FFT). When one looks at the discrete Fourier transform, one sees 
that each of the outputs has several inputs. A naive interpretation leads to $O(N^2)$ 
complexity. But a closer examination allows us to exploit the partial order structure 
by reusing redundant computation. Here too, we can enforce dominance relations 
using a special gadget that mimics the MR-gadget in linear space and time. [1] 
E.g. Graph B $\to$ Graph C in Figure 9 (and also Graph B $\to$ Graph C in Figure B24) 


[1] While the idea for this gadget emerged accidentally while thinking about FFT, the analogy with FFT stays at 
the metaphorical level, as far as the author can tell. There doesn't seem to be an obvious theoretical link 
between the two. Hence the prefix pseudo-. 














Note 6.2 (Property of matching-induced bipartite cycles). It is easy to see that in order 
to specify cycle constraints in graph $H$, it is enough to specify the set of $N_+$-edges in the cycle 
constraint. For instance, in Figure 6, to specify that A-B-C-D-E-F forms a cycle, it 
is enough to specify that the $N_+$-edges A, C and E in combination form a cycle. (The argument 
follows from the fact that since the orientation of the Hasse graph is matching induced, the 
edges B, D, F will necessarily be $N_-$-edges. Needless to say, the edges belong to the same level 
because of the matching property.) This elementary observation motivates the construction of $H_1$. 
Put differently, matching induced orientation allows us to specify cycle constraints merely in 
terms of $N_+$-edges. 

Definition 23 (MMFEP). min-Morse feedback edge problem: 

We are now in a position to define the MMFEP problem which is an extension of the MMUP in 
the following sense: The goal of MMFEP is to find a vector field that minimizes the number 
of critical cells over the set of all DGVFs that can be defined on a regular cell complex say K, 
along with certain additional prescriptions involving rigid/forbidden edges. 

Example 6.1 (MMFEP). Please refer to Figure 7, Part D. Graph D.1 shows an MMFEP 
instance specified as a min-POP instance with $v_{10} \to v_{12}$ and $v_{13} \to v_9$ specified as rigid edges. 
Graph D.2 shows a candidate solution. (Note that $v_{12} \to v_{10}$ and $v_9 \to v_{13}$ are depicted as 
rigid edges and they will necessarily be part of every solution, including the candidate solution 
depicted in Graph D.2 of Figure 7.) 


6.3. The $H_R$ gadget = Matching+Cycle Constraints 


Example 6.2. Consider the subgraph determined by vertices $v_1, v_2, v_3, v_4, v_5, v_6, v_7, v_8$ and 
$v_9$. Here we depict the gadget formed by this subgraph, which takes into consideration both 
matching and cycle constraints. 

The $H_R$ gadget models all matching constraints as well as cycle constraints of the original Hasse 
graph $H_{\mathcal{K}}$ strictly as cycle constraints, albeit with the introduction of the so-called rigid edges. 
Note that the $H_R$ gadget can be described as follows: $H_2 \to H_{3A} + H_{3B.1}$. 

E.g. Graph C.1 $\to$ Graph C.2 in Figure 7 constitutes an example of such a reduction. 

This gadget includes the MR-edges as well as the CR-edges. It still suffers from the 
drawback that MR-edges may be quadratic in number for certain graphs. This situation can be 
rectified by using the pseudo-FFT gadget in place of the matching gadget, i.e. $H_{3B.2}$ instead of $H_{3B.1}$. 
The pseudo-FFT gadget is described in subsection 6.5. 


6.4. Correctness and Complexity 


Note that all R-cycles are spurious. If an RE-formulation, say Hr, were to have an R-cycle then 
the formulation itself is incorrect, given the fact that all R-edges need to be included in Hr. 
Notation 24 (Isolated edges). The notation $e_{ij}$ represents an $N_+$-edge, whereas $\bar{e}_{ij}$ is an $N_-$-edge. 
Together $e_{ij}$ and $\bar{e}_{ij}$ form an isolated edge-pair of normal edges. This edge pair is formed at the 
time of the $H_1 \to H_2$ transformation as described in Step 2 of the reduction procedure described 
in subsection 6.2. The head-node of $e_{ij}$ is denoted as $e_{ij}^T$ and the tail node of $e_{ij}$ is denoted as $e_{ij}^B$. 
Also, $e_{ij}$ (in graphs $H_2$, $H_{3A}$, $H_{3B}$) is a representative of the edge(s) between vertices $v_i$ and $v_j$ 
(in graphs $H_0$ and $H_1$) prior to their isolation. For instance, let $i = 6$, $j = 1$, $k = 2$ in Figure 6. 
Then the edges A (between nodes $v_1$ and $v_6$) and B (between nodes $v_6$ and $v_2$) from Graph A.0 














[Figure 6 panels: (A.0) Original graph $H_0$; (A.1) Duplicated graph $H_1$ (a); (A.2) Matching gadget; (B.1) Duplicated graph $H_1$ with cycles ABCDEF and FEDCBA (b); (B.2) Cycle gadget] 

(a) For the sake of clarity, we depict $H_1$ restricted to edges incident on $v_6$ as opposed to showing the entire $H_1$. 
(b) We do not depict edges incident on vertices $v_4$ and $v_5$ since we are interested in cycles with length > 3. 

Figure 6: Matching Gadget A.0 $\to$ A.1 $\to$ A.2; Cycle Gadget B.1 $\to$ B.2 

















Figure 7: Cycle + Matching Gadget C.1 $\to$ C.2; MMFEP, min-POP 

























and Graph A.1 in Figure 6 become isolated edge pairs $A = e_{16}$, $\bar{A} = \bar{e}_{16}$ and $B = e_{26}$, $\bar{B} = \bar{e}_{26}$ 
respectively in Graph A.2 and Graph B.2. In Graph A.2 and Graph B.2, $A^T = e_{16}^T$, 
$A^B = e_{16}^B$, $B^T = e_{26}^T$ and $B^B = e_{26}^B$. 

Lemma 25. The rigid edge formulation for Morse Matching has no R-cycles. 


Proof. Consider the rigid edge formulation, namely $H_R$. Now delete all the N-edges from this 
graph to obtain another graph, say $H_I$. In graph $H_I$, the vertex set consists of $e_{ij}^B$ and $e_{ij}^T$ for 
every edge $e_{ij} \in G$. It is easy to see that, in graph $H_I$, all top nodes, namely $e_{ij}^T, e_{ik}^T, \ldots$ for edges 
$e_{ij}, e_{ik}, \ldots \in G$, are source nodes and all the bottom nodes, namely $e_{ij}^B, e_{ik}^B, \ldots$, are sink nodes. 
Since every node in $H_I$ is either a source node or a sink node, and since we have accounted 
for all the R-edges of graph $H_R$ in graph $H_I$, we conclude that the RE-formulation for Morse 
Matching has no R-cycles. □ 

The two sets of cycles (i.e. C-cycles and M-cycles) do not interfere. It can be easily seen 
that given an orientation of $H_R$, say $\overline{H}_R$ (that has cycles), if there exists an RN-cycle in $\overline{H}_R$ 
which is "composite", i.e. if this RN-cycle is composed of both CR-edges and MR-edges, 
then there exists at least one M-cycle in $\overline{H}_R$ such that both the N-edges of this M-cycle belong 
to the composite cycle. In other words, every composite cycle will include an entire M-cycle. 
Theorem 26. The rigid edge formulation ensures that there exists an RN-cycle in the rigid 
edge formulation that corresponds to a matching violation or to a cycle violation. 

Proof. (a.) Matching Violation $\Rightarrow$ RN-cycle: 

Suppose there exists a matching violation between edges $e_{ij}$ and $e_{jk}$, which correspond to 
N-edges $e_{ij}^B \to e_{ij}^T$ and $e_{jk}^B \to e_{jk}^T$. Since we have the rigid edges $e_{ij}^T \to e_{jk}^B$ and $e_{jk}^T \to e_{ij}^B$ in our 
RE-formulation $H_R$, we have the following RN-cycle: $e_{ij}^B \to e_{ij}^T$, $e_{ij}^T \to e_{jk}^B$, $e_{jk}^B \to e_{jk}^T$, $e_{jk}^T \to e_{ij}^B$. 
(b.) Cycle Violation $\Rightarrow$ RN-cycle: 

Note that if we have $e_{ij}$, $e_{jk}$ and $e_{kl}$ as edges in the original graph $H_0$, then in our RE-formulation we 
have a CR-edge going from $e_{ij}^T$ to $e_{kl}^B$. Therefore, w.l.o.g., if we have edges A-B-C-D-E-F 
forming a path in $H_0$ (where A, C and E are up-edges), then we have a corresponding RN-path 
in the RE-formulation, since we have CR-edges going from $A^T \to C^B$ and $C^T \to E^B$. Since 
every path in $H_0$ has a corresponding path in the RE-formulation, we can say the same thing 
about cycles. 

(c.) RN-cycle $\Rightarrow$ A Matching Violation or a Cycle Violation: 

The essential idea is the following: If no consecutive pair of up-N-edges in an RN-cycle share a 
vertex, then all edges together form a cycle and we have a cycle violation. Else, if we do have a 
pair of up-N-edges within the RN-cycle that share a vertex, then we have a matching violation. 
Every RN-cycle therefore represents either a cycle or a matching violation. 

In the RN-formulation, assume that the up-N-edge $e_{ij}$ is selected as part of the output. Edge 
$e_{ij}$ is the sole incoming edge for the vertex $e_{ij}^T$, whereas this vertex will have more than one outgoing 
MR-edge, each of the form $e_{ij}^T \to e_{ik}^B$ or $e_{ij}^T \to e_{kj}^B$. Now, some such adjacent edge $e_{ik}$ or 
$e_{kj}$ is either selected as part of the output or not selected. If we assume it is selected, then there will 
be an RN-cycle of the form described in part a. of the proof, which corresponds to a matching 
violation. Suppose none of the adjacent edges of the form $e_{ik}$ or $e_{kj}$ are selected as part of 
the output. Now, if we are to have a cycle involving $e_{ij}$, then such a cycle must include the 
outgoing CR-edges from $e_{ij}^T$. As observed in part b. of the proof, such outgoing edges go from 
$e_{ij}^T$ to $e_{kl}^B$ every time we have $e_{ij}$, $e_{jk}$, $e_{kl}$ as adjacent edges in graph $H_0$. So if $e_{kl}$ is selected as 
part of the output, we have $v_i, v_j, v_k, \ldots$ as a path in $H_0$. We inductively argue the same 
way by observing the outgoing CR-edges and MR-edges of $e_{kl}^T$. If an MR-edge is included as 
part of the cycle then a matching violation is assured, and when we exclude the possibility of 
involving MR-edges each time, we extend the path using CR-edges only. These CR-edges mimic 
connectivity in $H_0$. Therefore, when this path involving strictly CR-edges becomes a cycle in 
$H_R$, we can infer the existence of a corresponding cycle in $H_0$. □ 


6.5. How to linearize: The Pseudo-FFT Gadget 


Definition 27 (Above & Below Degrees). Consider the original Hasse graph $H_{\mathcal{K}}(V, E)$ 
corresponding to complex $\mathcal{K}$. Given a vertex $v_k \in V$, where vertex $v_k$ represents a simplex of $\mathcal{K}$, the 
above degree $\mathcal{D}_{A_k}$ of $v_k$ equals the number of cofaces incident on that simplex in complex $\mathcal{K}$. Also, 
the below degree $\mathcal{D}_{B_k}$ represents the dimension of that simplex in complex $\mathcal{K}$. 

The procedure of adding forbidden edges to take care of matching requirements that is 
mentioned above is disadvantageous because we need to add $\binom{\mathcal{D}_{A_k} + \mathcal{D}_{B_k}}{2}$ edges for each vertex $v_k \in V$. 
For a simplicial complex, $\mathcal{D}_{B_k} = (d + 1)$ where $d$ is the dimension of the simplex corresponding to 
vertex $v_k$ in the Hasse graph. In applied contexts, for a simplicial/cubical complex, the maximum 
dimension is typically less than 4. So, $\mathcal{D}_{B_k}$ is often bounded by a small constant. However, $\mathcal{D}_{A_k}$ 
can easily be unbounded and can potentially be as large as $O(|V|)$. This would mean that in 
constructing the matching gadget, we may end up with a quadratic number of forbidden edges 
(w.r.t. $|E|$) by adding $\binom{\mathcal{D}_{A_k} + \mathcal{D}_{B_k}}{2}$ edges for each vertex $v_k$. The worst case complexity of the matching 
gadget is therefore quadratic. The complexity bound $O(\mathcal{D}_{A_k} + \mathcal{D}_{B_k})$, i.e. a linear number of 
edges per vertex $v_k \in V$, is achievable if we use the pseudo-FFT-gadget as a replacement for the 
matching gadget. 


Please see Figure 9 and Figure B24 for illustrative examples. Figure 8 provides further 
intuition. Essentially, we replace rigid edges with rigid paths. Superficially, this seems like 
an additional layer of complexity. The point to note, however, is that unlike rigid edges, these 
rigid paths are not exclusive to any one pair of vertices. If we look at Figure 8, what we 
really need to model is a domination relation from $X_1$ to $Y_2$ and from $Y_1$ to $X_2$. In the matching 
gadget, we modeled these domination relations using the most elementary tool available to us, 
namely rigid edges. In the case of the pseudo-FFT gadget, we use non-exclusive rigid paths to model 
domination relations from $X_1$ to $Y_2$ and from $Y_1$ to $X_2$ respectively. The non-exclusivity of 
these paths leads to a drastic reduction in complexity. 

Definition 28 (Truncated Binary Tree). We define a truncated binary tree to be any binary 
tree with the root node (and the edges incident on it) removed. 

Definition 29 (Node Levels in Truncated Binary Trees). Given a binary tree of depth $\delta$, 
let the distance of a node $\sigma$ from the root $\rho$ be denoted as $d(\sigma)$. We define the rank of 
a node in the tree as $r(\sigma) = d(\sigma) + 1$. Finally, we define the level of a node $\sigma$ in the binary 
tree as $\mathcal{L}(\sigma) = \delta + 1 - r(\sigma)$. Analogously, we may define the distance, the rank and the level of nodes of 
truncated binary trees by imagining a fictitious root vertex $\rho_f$ at the top of the truncated tree. 
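
As a small worked illustration of the rank and level formulas, assuming that the depth counts the number of node layers (so that a tree of depth 3 has leaves at distance 2 from the root), the leaves sit on level 1 and the root on the highest level:

```python
def rank(distance_from_root):
    """r(sigma) = d(sigma) + 1"""
    return distance_from_root + 1


def level(distance_from_root, depth):
    """L(sigma) = depth + 1 - r(sigma); depth is assumed to count node layers."""
    return depth + 1 - rank(distance_from_root)


# levels of the root, its children and the leaves in a tree of depth 3
levels_depth_3 = [level(d, 3) for d in range(3)]
```

Here `levels_depth_3` lists the levels from the root downward, so levels decrease as the distance from the root grows.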
Definition 30 (up-directed, down-directed Binary Trees). A binary tree with directed edges is 
known as a directed binary tree. A down-directed binary tree is a binary tree with directed 
edges from the root to its children on the next level and in general from any parent node (in the 
sense of a binary tree) to a child node on the next level such that the root is the source. Also, 
all leaf nodes are sinks of directed paths that start from the root node. An up-directed binary 
tree is a tree that is similar to a down-directed binary tree, except for the fact that all its edge 









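Algorithm 1 Pseudo-FFT Construction 

Procedure createNodes($\mathcal{A}$, $N$) 
Input: Set of arcs $\mathcal{A}$ 
Output: $\log n$ levels of hierarchically constructed 'from-nodes' $F$ and 'to-nodes' $T$ 
1: ▷ {The 1st level from-nodes $F^1$ are the same as $\mathcal{A}^T$.} 
2: ▷ {The 1st level to-nodes $T^1$ are the same as $\mathcal{A}^B$.} 
3: Assign labels $[i_+]$ to the from-nodes $F_i^1$. Assign labels $[i_-]$ to the to-nodes $T_i^1$. 
4: Subsequently, $\forall \ell$ s.t. $\lceil N \rceil / 2^{\ell} > 1$, create $\lceil N \rceil / 2^{\ell}$ nodes for the $\ell$-th levels $F^{\ell}$ and $T^{\ell}$. 
5: return $\mathcal{N}_{\mathcal{L}}$ (i.e. the number of levels created including the 1st level). 
6: ▷ {Label assignment for levels $F^{\ell}$ and $T^{\ell}$ for levels $\ell > 1$ is deferred.} 

Procedure T-edges($T^p$, $T^q$, $M$) 
Input: To-nodes of levels $T^p$ and $T^q$ 
Output: Arcs joining nodes of $T^p$ to nodes of $T^q$. Labels for nodes of level $T^p$. 
1: Assign $i \leftarrow 1$, $j \leftarrow 1$. 
2: do 
3:   if $j \le M - 1$ then 
4:     Create arcs $T_i^p \to T_j^q$ & $T_i^p \to T_{j+1}^q$. 
5:     Assign label $\mathscr{L}(T_i^p) \leftarrow [\mathscr{L}(T_j^q) \bowtie \mathscr{L}(T_{j+1}^q)]$; $j \leftarrow j + 2$. 
6:     ▷ {The operator $x \bowtie y$ appends string $y$ to string $x$.} 
7:   else 
8:     Create arcs $T_i^p \to T_j^q$. Assign label $\mathscr{L}(T_i^p) \leftarrow [\mathscr{L}(T_j^q)]$; $j \leftarrow j + 1$. 
9:   $i \leftarrow i + 1$. 
10: while $j \le M$ 

Procedure F-edges($F^p$, $F^q$, $T^r$, $M$) 
Input: From-nodes of levels $F^p$, $F^q$ and to-nodes of $T^r$ 
Output: Arcs joining nodes of $F^p$ to nodes of $F^q$ & $T^r$. Labels for nodes of level $F^q$. 
1: Assign $i \leftarrow 1$, $j \leftarrow 1$. 
2: do 
3:   if $i \le M - 1$ then 
4:     Create arcs $F_i^p \to F_j^q$, $F_{i+1}^p \to F_j^q$, $F_i^p \to T_{i+1}^r$ & $F_{i+1}^p \to T_i^r$. 
5:     Assign label $\mathscr{L}(F_j^q) \leftarrow [\mathscr{L}(F_i^p) \bowtie \mathscr{L}(F_{i+1}^p)]$; $i \leftarrow i + 2$. 
6:   else 
7:     Create arcs $F_i^p \to F_j^q$. Assign label $\mathscr{L}(F_j^q) \leftarrow [\mathscr{L}(F_i^p)]$; $i \leftarrow i + 1$. 
8:   $j \leftarrow j + 1$. 
9: while $i \le M$ 

Procedure pseudo-FFT($\mathcal{A}$, $N$) 
Input: $\mathcal{A}$ is a set of N-edge pairs with cardinality $|\mathcal{A}| = N$ 
Output: The pseudo-FFT gadget that linearizes matching constraints 
1: $\mathcal{N}_{\mathcal{L}} \leftarrow$ createNodes($\mathcal{A}$, $N$) 
2: $\forall \ell \in [2, \mathcal{N}_{\mathcal{L}}]$, T-edges($T^{\ell}$, $T^{\ell-1}$, $|T^{\ell-1}|$) 
3: $\forall \ell \in [1, \mathcal{N}_{\mathcal{L}} - 1]$, F-edges($F^{\ell}$, $F^{\ell+1}$, $T^{\ell}$, $|F^{\ell}|$) 
4: Create arcs $F_1^{\mathcal{N}_{\mathcal{L}}} \to T_2^{\mathcal{N}_{\mathcal{L}}}$ & $F_2^{\mathcal{N}_{\mathcal{L}}} \to T_1^{\mathcal{N}_{\mathcal{L}}}$. 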









orientations are inverted. In this case, the root node is the unique sink of all directed paths and 
all leaf nodes are sources. 

Truncated up-directed (and truncated down-directed) binary trees are up-directed 
(and respectively down-directed trees) with corresponding root nodes removed. 

Definition 31 (From-nodes and To-nodes). Given a set of paired normal edges $\{\mathcal{A}\}$, all nodes 
in the truncated up-directed tree with $\{\mathcal{A}^T\}$ as their leaf nodes are known as from-nodes, 
whereas all nodes in the truncated down-directed tree with $\{\mathcal{A}^B\}$ as their leaf nodes are known 
as to-nodes. We refer to the corresponding trees as the up-directed from-tree and the 
down-directed to-tree respectively. 

Notation 32 (Number of Levels $\mathcal{N}_{\mathcal{L}}$). Define $\mathcal{N}_{\mathcal{L}}$ = total number of levels of the truncated 
from-tree = total number of levels of the truncated to-tree. 

Definition 33 (Label Inheritance Property). We say that node $v$ has s-label inheritance if 
there exists a node $u$ s.t. $u \to v \Rightarrow \mathscr{L}(u) \subseteq \mathscr{L}(v)$. Analogously, we say that node $u$ has the 
d-label inheritance property if there exists a node $v$ s.t. $u \to v \Rightarrow \mathscr{L}(v) \subseteq \mathscr{L}(u)$. 

Definition 34 (Label Merging). We say that two labels $\mathscr{L}(x)$ and $\mathscr{L}(y)$ merge at a from-level 
$i$ if $\mathscr{L}(x) \cap \mathscr{L}(y) = \emptyset$ and if there exists a label $\mathscr{L}(x) \cup \mathscr{L}(y)$ at level $i + 1$. A similar definition 
holds for labels at to-levels $j$ and $j + 1$. 

Definition 35 (Label Direct Inheritance). If there exist from-labels $\mathscr{L}(x)$ and $\mathscr{L}(y)$ at levels 
$i$ and $i + 1$ respectively such that $\mathscr{L}(x) = \mathscr{L}(y)$, then we say that $y$ has directly inherited 
(i.e. without merging) its label from $x$. A similar notion of direct inheritance holds for nodes of 
to-levels $j$ and $j + 1$. 

Definition 36 (Binary Complements). Binary complements are defined as follows: 

1. For level $p$ s.t. $1 \le p \le \mathcal{N}_{\mathcal{L}} - 1$, if we have three from-nodes, say $u = F_i^p$, $v = F_j^p$ 
and $w = F_k^{p+1}$, s.t. $u \to w$ and $v \to w$, then $u$ and $v$ are known as binary 
from-complements. In this case, $\mathscr{L}(u) + \mathscr{L}(v) = \mathscr{L}(w)$. 

2. The ultimate from-level $\mathcal{L}$ has at most two from-nodes. If there are precisely two in number, 
then they are considered to be binary from-complements. 

3. Similarly, for level $p$ s.t. $1 \le p \le \mathcal{N}_{\mathcal{L}} - 1$, if we have three to-nodes $x = T_k^{p+1}$, $y = T_i^p$ and 
$z = T_j^p$ s.t. $x \to y$ and $x \to z$, then $y$ and $z$ are known as binary to-complements. 
Here we have $\mathscr{L}(y) + \mathscr{L}(z) = \mathscr{L}(x)$. 






[Figure 9 panels: A. Hasse graph $H_1$ with duplicated edges; B. Hasse graph $H_2$ after edge pair isolation and cloning of vertices; C. Hasse graph $H_{3B.2}$ on addition of pseudo-FFT rigid edges ('+' suffix-nodes are from-nodes whereas '-' suffix-nodes are to-nodes)] 

Figure 9: The pseudo-FFT gadget for vertex $v_2$ (labelled '1' in this figure) 

















4. The ultimate to-level $\mathcal{L}$ has at most two to-nodes. If there are precisely two in number, 
they are considered to be binary to-complements. 

5. If $F_i^p$ and $F_j^p$ are binary from-complements and $T_i^p$ and $T_j^p$ are the respective binary 
to-complements at level $p$, then the pair of nodes $\{F_i^p, T_j^p\}$ (as well as the pair $\{T_i^p, F_j^p\}$) 
form mirror-binary complements. 

If the number of nodes at a particular level is odd, then all nodes except the last node of 
that level have binary complements. 

Definition 37 (Label Domination). If we have an edge $x \to y$ between two vertices $x$ and $y$, 
then we say that $x$ dominates $y$. Also, the relation of domination is transitive, i.e. if we have 
$x \to y$ and $y \to z$ then we say that $x$ dominates $z$. 

Therefore this gadget introduces three types of edges $x \to y$, namely: 

1. the from-node edges $x \to y$ s.t. $\mathscr{L}(x) \subseteq \mathscr{L}(y)$ (via s-label inheritance), 

2. the to-node edges $x \to y$ s.t. $\mathscr{L}(y) \subseteq \mathscr{L}(x)$ (via d-label inheritance) and 

3. the from-to edges $x \to y$ s.t. the from-node $x$ dominates its mirror complement to-node $y$. 
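
The three edge types can be illustrated with a compact sketch of the gadget. This is a simplified reading and not a transcription of Algorithm 1: labels are omitted, nodes are named by hypothetical tuples `("F", level, index)` and `("T", level, index)` with 0-based indices, and an unpaired last node of an odd level is simply passed up to the next level. The reachability helper then lets one check the intended dominance: the top of pair i should reach the bottom of every pair j with j different from i, and never its own bottom.

```python
def pseudo_fft_gadget(n):
    """Rigid arcs of a pseudo-FFT-style gadget for n edge pairs.
    ("F", 1, i) plays the role of the top node of pair i and
    ("T", 1, i) the role of its bottom node."""
    arcs = []
    level, width = 1, n
    while True:
        # from-to edges between mirror-binary complements of this level
        for k in range(0, width - 1, 2):
            arcs.append((("F", level, k), ("T", level, k + 1)))
            arcs.append((("F", level, k + 1), ("T", level, k)))
        if width <= 2:          # ultimate level reached (root is truncated)
            break
        for k in range(width):
            # from-node edges point up, to-node edges point down
            arcs.append((("F", level, k), ("F", level + 1, k // 2)))
            arcs.append((("T", level + 1, k // 2), ("T", level, k)))
        level, width = level + 1, (width + 1) // 2
    return arcs


def dominated_bottoms(arcs, n, i):
    """Indices j such that the top of pair i reaches the bottom of pair j."""
    successors = {}
    for u, v in arcs:
        successors.setdefault(u, []).append(v)
    seen, stack = {("F", 1, i)}, [("F", 1, i)]
    while stack:
        u = stack.pop()
        for v in successors.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return {j for j in range(n) if ("T", 1, j) in seen}
```

In this sketch the number of arcs grows linearly in `n`, whereas joining every top to every other bottom directly would need `n * (n - 1)` rigid edges.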
Lemma 38. All from-nodes of the FFT gadget have s-label inheritance whereas all to-nodes 
have d-label inheritance. 

Proof. For the proof of this lemma we refer to the construction as delineated in 
Algorithm 1. We begin by observing Procedure T-edges(.). Let $u = T_i^p$, $v = T_j^q$ and 
$w = T_{j+1}^q$. Line 4 of Procedure T-edges(.) creates edges $u \to v$ and $u \to w$, whereas Line 5 
of Procedure T-edges(.) is equivalent to saying $\mathscr{L}(v) \subseteq \mathscr{L}(u)$ and $\mathscr{L}(w) \subseteq \mathscr{L}(u)$. For the 
alternative case on Line 8 we have $u \to v$ followed by $\mathscr{L}(v) \subseteq \mathscr{L}(u)$. Since all to-node labels 
are created in Procedure T-edges(.), we conclude from the two cases above that to-nodes have 
the d-label inheritance property. Now we turn our attention to Procedure F-edges(.). Let 
$t = F_i^p$, $u = F_{i+1}^p$, $v = F_j^q$. Line 4 of Procedure F-edges(.) creates edges $t \to v$ and $u \to v$. 
Moreover, the label assignment that follows assigns a label to $v$ in a manner such that $\mathscr{L}(t) \subseteq \mathscr{L}(v)$ and $\mathscr{L}(u) \subseteq \mathscr{L}(v)$. 

Finally, for the alternative case in Line 7, we have $t \to v$ followed by $\mathscr{L}(t) \subseteq \mathscr{L}(v)$. All from-node labels 
are created in Procedure F-edges(.). Hence, from-nodes have the s-label inheritance property. □ 

Lemma 39 (Existence and Uniqueness of Labels for Every From-Level). Suppose we are given 
a set of N-edge pairs $\mathcal{A}$ with cardinality $|\mathcal{A}| = N$. Then, $\forall i \in [1 \ldots N]$, the labels $\mathscr{L}(F_i^1)$, where 
$F_i^1 = \mathcal{A}_i^T$, are s.t. there exists a set of indices $\{i_p\}$ that satisfy the Label Property, namely: 

$\mathscr{L}(F_i^1) \subseteq \mathscr{L}(F_{i_p}^p) \quad \forall\, 2 \le p \le \mathcal{N}_{\mathcal{L}}$ 

Moreover, such an index $i_p$ that satisfies the label property for a given label $\mathscr{L}(F_i^1)$ is unique for 
every level $p$. 

Proof. To begin with, note that the subroutine F-edges is called from Line 3 of subroutine 
pseudo-FFT. For each $\ell$, edges joining nodes of $F^{\ell}$ to nodes of $F^{\ell+1}$ are formed. Denote the 
total number of nodes of level $F^{\ell}$ (namely $|F^{\ell}|$) by $M$. Also, let $W = (M-1)/2 + 1$. In the while 
loop of Lines 2-9 of subroutine F-edges, if $M$ is even then, $\forall i$ s.t. $1 \le i \le M/2$, we merge 
labels of pairs of nodes $(x_i, x_{i+1})$ of level $F^{\ell}$ by creating edges $x_i \to y_j$ and $x_{i+1} \to y_j$ between 
such a pair and a unique vertex $y_j$ of level $F^{\ell+1}$. For each level $\ell$, there will be $M/2$ such pairs of 
level $F^{\ell}$ that form edges with $M/2$ nodes of level $F^{\ell+1}$. Else, if the number $M = |F^{\ell}|$ is odd, then 
we merge labels of $(M-1)/2$ pairs of nodes of level $F^{\ell}$, namely $(x_i, x_{i+1})$, by creating edges between 
each pair of nodes of $F^{\ell}$ and a unique node $y_j$ of level $F^{\ell+1}$ (as before), whereas the final node 
of $F^{\ell+1}$, namely $y_W$, directly inherits its label from the final node $x_M$ of level $F^{\ell}$ via the edge $x_M \to y_W$. 
In both cases, whether even or odd, for every node $x$ in $F^{\ell}$, we have exactly one node $y$ in level 
$F^{\ell+1}$ such that $x \to y$. This gives us the per level uniqueness property. Now, since the subroutine 
F-edges is invoked for every consecutive pair of levels $(\ell, \ell + 1)$ within subroutine pseudo-FFT, 
we conclude that by (a) transitivity of dominance relations formed by directed edges across 
nodes of successive levels and (b) the per level uniqueness property as discussed above, 
the label property as described in the statement of the lemma is satisfied. □ 

Lemma 40 (Existence and Uniqueness of Labels for Every To-Level). Suppose we are given a 
set of N-edge pairs sP with cardinality \sP\ = N. Then, Vj 6 [1 ... N\, the labels Jz?(71 1 ) where 
Tj = sP^ are S 't' there exists a set of indices \ j p } that satisfy the Label Property namely: 

^(7/) c jz?( 7|) V 2 < p < Me 


Moreover, such an index j p that satisfies the label property for a given label J£(Tj-) is unique for 
every level p. 


Proof. The subroutine T-edges is invoked in Line [2] of subroutine pseudo-FFT. For each $\ell$, edges joining nodes of $\mathcal{T}^{\ell}$ to nodes of $\mathcal{T}^{\ell-1}$ are formed. Denote the total number of nodes of level $\mathcal{T}^{\ell-1}$ (namely $|\mathcal{T}^{\ell-1}|$) by $M$. Also, let $W = (M-1)/2 + 1$. In the while loop of Lines [2-10] of subroutine T-edges, if $M$ is even then, $\forall i$ s.t. $1 \le i \le M/2$, we merge labels of pairs of nodes $(\kappa_i, \kappa_{i+1})$ of level $\mathcal{T}^{\ell-1}$ by creating edges $\gamma_j \to \kappa_i$ and $\gamma_j \to \kappa_{i+1}$ between such a pair and a unique vertex $\gamma_j$ of level $\mathcal{T}^{\ell}$. For each level $\ell$, we have $M/2$ such pairs of nodes from level $\mathcal{T}^{\ell-1}$ that form edges with $M/2$ nodes of level $\mathcal{T}^{\ell}$. Else, if the number $M = |\mathcal{T}^{\ell-1}|$ is odd, then we merge labels of $(M-1)/2$ pairs of nodes of level $\mathcal{T}^{\ell-1}$, namely $(\kappa_i, \kappa_{i+1})$, by creating edges $\gamma_j \to \kappa_i$ and $\gamma_j \to \kappa_{i+1}$ between each pair of nodes of $\mathcal{T}^{\ell-1}$ and a unique node $\gamma_j$ of level $\mathcal{T}^{\ell}$ (as before), whereas the final node of $\mathcal{T}^{\ell}$, namely $\gamma_W$, directly inherits its label from the final node $\kappa_M$ of level $\mathcal{T}^{\ell-1}$ via the edge $\gamma_W \to \kappa_M$. In both cases, whether even or odd, for every node $\kappa$ in $\mathcal{T}^{\ell-1}$, we have exactly one node $\gamma$ in level $\mathcal{T}^{\ell}$ such that $\gamma \to \kappa$. This gives us the per level uniqueness property. Now (symmetric to the argument made in Theorem 39), since the subroutine T-edges is invoked for every consecutive pair of levels $(\ell, \ell + 1)$ within subroutine pseudo-FFT, we conclude that by (a) transitivity of dominance relations formed by directed edges across nodes of successive levels, and (b) the per level uniqueness property as described above, the label property as described in the statement of the lemma is satisfied. □
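
The per level uniqueness argument can be checked mechanically on small inputs. The following is a minimal sketch, not the T-edges subroutine itself: it assumes that labels are sets of edge indices, that adjacent nodes are merged pairwise, that an unpaired final node inherits its label unchanged, and that merging stops once a level has at most two nodes. The names `build_levels` and `unique_container` are hypothetical.

```python
def build_levels(n):
    """Build label levels by pairwise merging, starting from n singleton labels.

    Assumption: adjacent nodes (2i, 2i+1) of a level are merged; if the level
    has odd size, its last node passes its label on unchanged (direct label
    inheritance). Merging stops once a level has at most two nodes.
    """
    levels = [[frozenset([j]) for j in range(n)]]
    while len(levels[-1]) > 2:
        prev = levels[-1]
        nxt = [prev[i] | prev[i + 1] for i in range(0, len(prev) - 1, 2)]
        if len(prev) % 2 == 1:
            nxt.append(prev[-1])  # direct label inheritance for the odd node
        levels.append(nxt)
    return levels


def unique_container(levels, j):
    """For each level, return the index of the single node whose label contains j."""
    indices = []
    for level in levels:
        hits = [idx for idx, label in enumerate(level) if j in label]
        assert len(hits) == 1  # per level uniqueness
        indices.append(hits[0])
    return indices


levels = build_levels(9)
print([len(level) for level in levels])
print(unique_container(levels, 8))
```

Under these assumptions every level-1 label is contained in exactly one label per higher level, which is the label property stated in the lemma.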


Theorem 41. Given a set of edges $\mathcal{A}$, we can ensure that every node of the type $\mathcal{A}_i^T$ dominates every node of the type $\mathcal{A}_j^B$ (for $i \neq j$) using rigid edges that are part of the pseudo-FFT gadget.


Proof. Consider two edges $\mathcal{A}_i = (\mathcal{A}_i^T, \mathcal{A}_i^B)$ and $\mathcal{A}_j = (\mathcal{A}_j^T, \mathcal{A}_j^B)$. We will show that there exists a directed path from $\mathcal{A}_i^T$ to $\mathcal{A}_j^B$ and another directed path from $\mathcal{A}_j^T$ to $\mathcal{A}_i^B$. Let $\mathcal{F}_i^1 = \mathcal{A}_i^T$ and $\mathcal{F}_j^1 = \mathcal{A}_j^T$. Also, let $\mathcal{T}_i^1 = \mathcal{A}_i^B$ and $\mathcal{T}_j^1 = \mathcal{A}_j^B$. We know from Theorem 39 that the label property will be satisfied for all subsequent from-levels $2 \le p \le M_e$. Let $\mathcal{F}_{i_p}^p$ be such that $\mathcal{L}(\mathcal{F}_i^1) \subset \mathcal{L}(\mathcal{F}_{i_p}^p)$ and $\mathcal{F}_{j_p}^p$ be such that $\mathcal{L}(\mathcal{F}_j^1) \subset \mathcal{L}(\mathcal{F}_{j_p}^p)$. We consider two cases here:

Case 1: The labels of $\mathcal{F}_{i_q}^q$ and $\mathcal{F}_{j_q}^q$ are binary complements for some from-level $1 \le q \le M_e - 1$.

In this case, from Line [4] of subroutine F-edges we conclude that edges are formed between the mirror-binary complements: $\mathcal{F}_{i_q}^q \to \mathcal{T}_{j_q}^q$ and $\mathcal{F}_{j_q}^q \to \mathcal{T}_{i_q}^q$. Moreover, using the s-label inheritance property along with the label property for from-nodes, we know that $\mathcal{F}_i^1 = \mathcal{A}_i^T$ dominates $\mathcal{F}_{i_q}^q$ (also, $\mathcal{F}_j^1 = \mathcal{A}_j^T$ dominates $\mathcal{F}_{j_q}^q$), and by using the d-label inheritance property along with the label property for to-nodes, we know that $\mathcal{T}_{j_q}^q$ dominates $\mathcal{T}_j^1 = \mathcal{A}_j^B$ (also, $\mathcal{T}_{i_q}^q$ dominates $\mathcal{T}_i^1 = \mathcal{A}_i^B$). Thus, for this particular case, the statement of the above theorem holds.

Case 2: Now consider the case when the labels of $\mathcal{F}_{i_q}^q$ and $\mathcal{F}_{j_q}^q$ are not binary complements for any of the from-levels $1 \le q \le M_e - 1$. Since there are only two from-nodes for level $L = M_e$, one of the nodes is $\mathcal{F}_{i_L}^L$ and the other is $\mathcal{F}_{j_L}^L$. By definition, the two nodes of the final level are always binary complements, and from Line [4] of subroutine pseudo-FFT we have edges between mirror-binary complements. Once again (as in Case 1), by applying the s-label inheritance property and the label property for from-nodes to $\mathcal{F}^L$, and the d-label inheritance property along with the label property for to-nodes to $\mathcal{T}^L$, we conclude that the statement of the theorem above holds for final-level mirror-binary complements.

The two cases exhaust all possibilities (i.e. two given labels $\mathcal{A}_i^T$ and $\mathcal{A}_j^T$ have labels $\mathcal{F}_{i_q}^q$ and $\mathcal{F}_{j_q}^q$ containing them that form binary complements at exactly one level $1 \le q \le M_e$, and at that particular level we have edges joining mirror-binary complements which ensure that domination holds through the s-label and d-label inheritance properties). Hence proved. □

Notation 42. We use the notation $a \rightsquigarrow b$ to indicate a path joining nodes $a$ and $b$ and the notation $a \to b$ to indicate an edge joining nodes $a$ and $b$.

Theorem 43. The pseudo-FFT gadget and the matching gadget are equivalent. 


Proof. Part I: Matching $\Rightarrow$ pseudo-FFT

Suppose there exists a rigid cycle involving normal edges $\mathcal{A}_i^B \to \mathcal{A}_i^T$, $\mathcal{A}_j^B \to \mathcal{A}_j^T$ and rigid edges belonging to matching gadgets $\mathcal{A}_i^T \to \mathcal{A}_j^B$ and $\mathcal{A}_j^T \to \mathcal{A}_i^B$. This would mean that there is a matching conflict involving edges $\mathcal{A}_i$ and $\mathcal{A}_j$. Theorem 41, in turn, implies that there will be corresponding rigid paths $\mathcal{A}_i^T \rightsquigarrow \mathcal{A}_j^B$ and $\mathcal{A}_j^T \rightsquigarrow \mathcal{A}_i^B$ respectively for the pseudo-FFT gadget. The presence of normal edges $\mathcal{A}_i^B \to \mathcal{A}_i^T$ and $\mathcal{A}_j^B \to \mathcal{A}_j^T$ will therefore result in a cycle for the pseudo-FFT gadget (just as it would if we were to use the matching gadget instead).

Part II pseudo-FFT => Matching 

To begin with, observe that rigid edges belonging to the pseudo-FFT gadget either join (a) a lower-level from-node to a higher-level from-node, (b) a higher-level to-node to a lower-level to-node, or (c) a from-node to a to-node of the same level, thus maintaining a hierarchy. The from-nodes and to-nodes of level $p$ such that $2 \le p \le M_e$ do not have any normal edges through them. Consider an RN-cycle $\mathcal{C}$ that alternates between N-edges of the form $\mathcal{A}_k^B \to \mathcal{A}_k^T$ and FFT-rigid paths of the form $\mathcal{A}_k^T \rightsquigarrow \mathcal{A}_l^B$. Suppose that the FFT-rigid path $\mathcal{A}_i^T \rightsquigarrow \mathcal{A}_j^B$ is part of the RN-cycle $\mathcal{C}$. Then the corresponding rigid path $\mathcal{A}_j^T \rightsquigarrow \mathcal{A}_i^B$ is also a part of the FFT-gadget. (These rigid paths mirror the rigid edges $\mathcal{A}_i^T \to \mathcal{A}_j^B$ and $\mathcal{A}_j^T \to \mathcal{A}_i^B$ of the matching gadget.) Also, if $\mathcal{A}_i^T \rightsquigarrow \mathcal{A}_j^B$ is part of the RN-cycle $\mathcal{C}$, then clearly the N-edges $\mathcal{A}_i^B \to \mathcal{A}_i^T$ and $\mathcal{A}_j^B \to \mathcal{A}_j^T$ are also a part of the RN-cycle $\mathcal{C}$. Now, this ensures that a matching conflict between edges $\mathcal{A}_i^B \to \mathcal{A}_i^T$ and $\mathcal{A}_j^B \to \mathcal{A}_j^T$ occurs because of the presence of the sequence of edges and paths: $\mathcal{A}_i^B \to \mathcal{A}_i^T$, $\mathcal{A}_i^T \rightsquigarrow \mathcal{A}_j^B$, $\mathcal{A}_j^B \to \mathcal{A}_j^T$, and $\mathcal{A}_j^T \rightsquigarrow \mathcal{A}_i^B$. So if there is a cycle violation in the pseudo-FFT gadget, there will be at least one cycle violation in the corresponding matching gadget. In fact, in the above case, every pair $\mathcal{A}_i, \mathcal{A}_j$ where $i, j \le n$ and $\mathcal{A}_i^T \rightsquigarrow \mathcal{A}_j^B$ is an FFT-rigid path results in an RN-cycle indicative of a matching conflict. Hence proved. □



Theorem 44. Suppose that we replace matching gadgets by pseudo-FFT gadgets for matching conflicts at every vertex in the Hasse graph. Then every matching violation or cycle violation corresponds to an RN-cycle in the rigid edge formulation.


Proof. The proof of this theorem mirrors the proof of Theorem 26 if we replace every rigid edge $\mathcal{A}^T \to \mathcal{B}^B$ of the matching gadget with the rigid path $\mathcal{A}^T \rightsquigarrow \mathcal{B}^B$ of the pseudo-FFT gadget. □


6.6. Objective: From Gradient Pairs to Critical Cells 

To begin with, we need to translate the count of non-matching edges incident on each vertex into the count of critical cells. Let $C(v_i)$ be an indicator of criticality for vertex $v_i$: it is 0 when $v_i$ is matched and 1 when it is unmatched. Let $S_i$ be the set of N-edges incident on vertex $v_i \in \mathcal{H}$, where $\mathcal{H}$ is the original Hasse graph. Consider the term $T(v_i) = \sum_{e_{ij} \in S_i} e_{ij}^N - (d(v_i) - 1)$. By definition, the number of incident N-edges is $|S_i| = d(v_i)$. If all of them are unmatched then $\sum_{e_{ij} \in S_i} e_{ij}^N = d(v_i)$, whereas if one of them is matched then the sum equals $d(v_i) - 1$. If $v_i$ is matched with some vertex $v_j$ then, by definition, the N-edge variable $e_{ij}^N = 0$, whereas $e_{ij}^N = 1$ for all $j$ such that $v_i, v_j$ are not matched. So, when $v_i$ is matched we have $T(v_i) = (d(v_i) - 1) - (d(v_i) - 1) = 0$, whereas when $v_i$ is unmatched we have $T(v_i) = d(v_i) - (d(v_i) - 1) = 1$. Therefore, $C(v_i) = T(v_i)$.

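We formulate the objective thus:

$$T(v_i) = \sum_{e_{ij} \in S_i} e_{ij}^N - (d(v_i) - 1)$$

$$T(\mathcal{H}) = \sum_{v_i \in V} \Big( \sum_{e_{ij} \in S_i} e_{ij}^N - (d(v_i) - 1) \Big) = \sum_{v_i \in V} C(v_i) \qquad (6)$$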


Recall that MMUP approximates the number of criticalities in terms of the aggregate Morse number, whereas MMMP approximates the number of gradient pairs (Morse matchings) in terms of the optimal number of gradient pairs. As per the SMOC condition, if we were to assume that the optimal aggregate Morse number closely models the aggregate Betti number, then the approximation of MMUP is equivalently an approximation of the Betti numbers. Thus, under SMOC, MMUP as a problem-model offers a significant advantage over MMMP since it allows us to approximate the aggregate Betti numbers using a significantly smaller complex.

Theorem 45. There is an approximation preserving reduction from MMUP/MMFEP to min-POP.


Proof. The proof is relatively straightforward. From Equation 6 along with Theorem 26 (for min-POP using cycle+matching gadgets) or Theorem 44 (for cycle+pseudo-FFT gadgets), we conclude that the values of the objective functions are, in fact, equal for MMUP and the corresponding min-POP problem. This means that Condition 2 of approximation preserving reductions as specified in Equation 2 is satisfied. Since objective values are equal for all candidate solutions, they are also equal for the optimal solution. Therefore, Condition 1 of approximation preserving reductions as specified in Equation 1 is satisfied. The reduction procedures are clearly polynomial time (quadratic if cycle+matching gadgets are used and linear if cycle+pseudo-FFT gadgets are used). The procedure to construct the solution for MMUP given a solution for min-POP is also linear. All that is required to obtain the MMUP solution is to count the number of N-edges in the min-POP solution. Once we have a count of N-edges, from Equation 6 we get the number of criticalities (unmatched nodes) and we are done. Identical reasoning holds for the MMFEP to min-POP reduction. □
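
The counting step that turns unmatched N-edges into critical cells can be illustrated as follows. This is a minimal sketch under stated assumptions: the graph is given as an explicit edge list, a matching is a set of unordered vertex pairs with each vertex in at most one pair, and the function name `criticality` is hypothetical.

```python
def criticality(vertices, edges, matching):
    """Return {v: T(v)} with T(v) = (sum of e_ij over incident edges) - (d(v) - 1).

    e_ij is 0 for a matched edge and 1 for an unmatched edge, so T(v) is 0
    for a matched vertex and 1 for an unmatched (critical) vertex.
    Assumption: `edges` and `matching` contain frozenset({u, v}) pairs and
    every vertex appears in at most one matched pair.
    """
    T = {}
    for v in vertices:
        incident = [e for e in edges if v in e]
        unmatched = sum(0 if e in matching else 1 for e in incident)
        T[v] = unmatched - (len(incident) - 1)
    return T


# Path a - b - c with the pair (a, b) matched: only c remains critical.
vertices = ["a", "b", "c"]
edges = [frozenset({"a", "b"}), frozenset({"b", "c"})]
matching = {frozenset({"a", "b"})}
T = criticality(vertices, edges, matching)
print(T, sum(T.values()))
```

Summing `T` over all vertices gives the number of critical cells, which is the quantity on the right-hand side of Equation 6.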

Theorem 46. The approximation preserving reduction from MMUP/MMFEP to min-POP is 
linear in space and time. 

Proof. Note that during the edge isolation procedure the number of vertices increases. Specifically, we start with $2 \times |\mathcal{E}|$ vertices as opposed to starting with $|V|$ vertices. The pseudo-FFT gadget adds a linear number of nodes and edges, which makes the entire graph reduction linear. The argument goes as follows:

• For every vertex $v$ in the Hasse graph, we need to construct a pseudo-FFT gadget. Level 1 has $\mathcal{D}_v$ nodes (where $\mathcal{D}_v$ is the sum of the above degree $\mathcal{D}_{Av}$ and the below degree $\mathcal{D}_{Bv}$ of vertex $v$ in the original Hasse graph). Level $k$ has $\lceil \mathcal{D}_v / 2^k \rceil$ new nodes.

• The total number of levels is bounded by $\log \mathcal{D}_v$.

• From a simple power series calculation we can deduce that the total number of newly introduced from-nodes (from label merging) in the pseudo-FFT gadget for some vertex $v$ is upper bounded by $\mathcal{D}_v$. Also, since there are $\log \mathcal{D}_v$ from-levels, the number of from-nodes created via direct label inheritance is upper bounded by $\log \mathcal{D}_v$.

• A similar counting argument can be made for the total number of to-nodes. Hence, there are in all $2 \times (\mathcal{D}_v + \log \mathcal{D}_v)$ newly introduced nodes.

• The total number of nodes (original + newly introduced) will be $3 \times \mathcal{D}_v + 2 \times \log \mathcal{D}_v$.

• Each node in the pseudo-FFT gadget has at most two outgoing rigid edges. Therefore, the total number of rigid edges introduced per vertex $v$ in the original Hasse graph will be bounded by $6 \times \mathcal{D}_v + 4 \times \log \mathcal{D}_v$. For a sufficiently large $\mathcal{D}_v$ this estimate is upper bounded by $7 \times \mathcal{D}_v$.

• Each edge is counted once as part of the above degree of some vertex and once as part of the below degree of another. When summed across all vertices in the Hasse graph, we get an upper bound of $14 \times |\mathcal{E}|$ on the number of newly introduced rigid edges, taking into account every node in the Hasse graph.

• The cycle gadget introduces $|\mathcal{E}|$ new rigid edges. Therefore, the number of rigid edges introduced (cycle+pseudo-FFT) will be upper bounded by $15 \times |\mathcal{E}| = O(|\mathcal{E}|)$.

• Also, we assume that the dimension of the input simplicial complex is a small constant. So, we have $O(|\mathcal{E}|) = O(|V|)$. So, the total number of nodes and edges added to the graph is bounded by $O(|V|)$.

Since the number of new nodes and edges introduced is linear, the space and time complexity of the reduction procedure is also linear. □
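
The power series bound on the number of new from-nodes can be sanity-checked numerically. The sketch below assumes that each new level is obtained from the previous one by pairwise merging with an unpaired node carried over (so a level of size $M$ yields a level of size $\lceil M/2 \rceil$), and that the gadget stops at a level with two nodes. The function names are hypothetical.

```python
import math


def gadget_level_sizes(degree):
    """Sizes of successive from-levels, starting from `degree` level-1 nodes.

    Assumption: a level of size M produces a level of size ceil(M / 2), and
    the final level has at most two nodes.
    """
    sizes = [degree]
    while sizes[-1] > 2:
        sizes.append((sizes[-1] + 1) // 2)  # ceil(M / 2)
    return sizes


def new_from_nodes(degree):
    """Number of from-nodes introduced above level 1."""
    return sum(gadget_level_sizes(degree)[1:])


for d in (9, 16, 1000):
    bound = d + math.log2(d)
    print(d, gadget_level_sizes(d), new_from_nodes(d), new_from_nodes(d) <= bound)
```

For every degree tried here the count stays below $\mathcal{D}_v + \log_2 \mathcal{D}_v$, in line with the bound used in the proof.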



gadget that achieves the effect of forbidden edges for matching using significantly fewer edges. It should be noted that every edge depicted in the examples of the pseudo-FFT gadget is a rigid edge. The nodes in red, blue, purple and green are the newly introduced nodes, whereas the peach-colored nodes are introduced at the time of edge isolation. It can be easily verified from the construction that the number of newly introduced nodes that are specific to the pseudo-FFT gadget is always smaller than $2 \times (\mathcal{D}_{Av} + \mathcal{D}_{Bv} + \log(\mathcal{D}_{Av} + \mathcal{D}_{Bv}))$ by a simple power series calculation (whereas nodes created at the time of edge isolation are bounded by $(\mathcal{D}_{Av} + \mathcal{D}_{Bv})$). Therefore, the total number of nodes in our examples is asymptotically bounded by $4 \times (\mathcal{D}_{Av} + \mathcal{D}_{Bv})$. All the peach nodes on the left and all the red, blue and green nodes have two outgoing rigid edges, all the purple nodes have one outgoing rigid edge, whereas the peach nodes on the right have none. So, the total number of new rigid edges introduced (for all $4 \times (\mathcal{D}_{Av_1} + \mathcal{D}_{Bv_1})$ nodes counted once) is asymptotically upper bounded by $8 \times (\mathcal{D}_{Av_1} + \mathcal{D}_{Bv_1})$. Therefore, if we apply this bound to all vertices of the original Hasse graph and sum over the pseudo-FFT gadgets of all the vertices, we get $8 \times (|\mathcal{E}| + |V| \times |\mathcal{D}_{B\max}|)$ edges. We consider the number of edges to be linear in the number of vertices and the dimension of the simplicial complex to be a small constant. Therefore, under these assumptions, we achieve linearity.

In the example depicted in Figure 8 we see the pseudo-FFT gadget for a vertex $v_1$ which has degree $(\mathcal{D}_{Av_1} + \mathcal{D}_{Bv_1}) = 16$, which happens to be a power of 2. This example is quite symmetric since all newly formed vertices obtain their labels through merging.

In contrast, in the example depicted in Figure 9, we see a gadget for a vertex $v_2$ which has degree $(\mathcal{D}_{Av_2} + \mathcal{D}_{Bv_2}) = 9$, which isn't a power of 2. Hence, we see a mix of merging and direct label inheritance.


7. Lower Bound on min-POP

Theorem 47. It is NP-hard to approximate min-POP within a factor better than $O(\log n)$.

Proof. The proof follows via an approximation preserving reduction from min-Set-Cover to 
min-POP. The construction and proof are relatively straightforward. The curious connection it 
establishes between two classical problems is surprising. Construction and proof deferred. □ 


8. SDP Formulation for Directed c-Balanced Separator 


From Theorem 46 we conclude that, in order to design an efficient algorithm for MMUP we 
need to design an efficient algorithm for min-POP (which is essentially min-FAS with rigid 
edges). To solve min-FAS with rigid edges we have two strategies: LP-based approximation and 
SDP-based approximation. In this work, we focus on the SDP based solution. 

Definition 48 (Directed Semimetric on a Graph $\mathcal{G}$). Let $\mathcal{G} = (V, \mathcal{E})$ be a directed graph. A directed semimetric $\mathcal{D}$ on a graph $\mathcal{G}$ satisfies the following conditions:

1. $\forall v_i \in V$, $\mathcal{D}(v_i, v_i) = 0$.

2. $\forall v_i, v_j, v_k \in V$, $\mathcal{D}(v_i, v_j) + \mathcal{D}(v_j, v_k) \ge \mathcal{D}(v_i, v_k)$.

Agarwal et al. [5] use the directed semimetric $\mathcal{D}(v_i, v_j) = |v_i - v_j|^2 - |v_0 - v_i|^2 + |v_0 - v_j|^2$ for a fixed reference vector $v_0$, where the vectors $v_i, v_j$ (with $i, j \neq 0$) are in 1-1 correspondence with the vertices $v_i, v_j \in \mathcal{G}$.⁸ Also, the quantities $\mathcal{D}(v_i, v_j)$ are in 1-1 correspondence with the edges $v_i \to v_j \in \mathcal{E}$.
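
To see why this quantity is directed, it helps to evaluate it on the two antipodal vectors $v_0$ and $-v_0$. The following is a small sketch in plain Python; the helper names `sq_dist` and `directed_semimetric` are hypothetical and vectors are represented as tuples.

```python
def sq_dist(x, y):
    """Squared Euclidean distance |x - y|^2."""
    return sum((p - q) ** 2 for p, q in zip(x, y))


def directed_semimetric(u, v, v0):
    """D(u, v) = |u - v|^2 - |v0 - u|^2 + |v0 - v|^2 for a reference vector v0."""
    return sq_dist(u, v) - sq_dist(v0, u) + sq_dist(v0, v)


v0 = (1.0, 0.0)
neg_v0 = (-1.0, 0.0)

print(directed_semimetric(v0, neg_v0, v0))   # pair leaving the v0 side
print(directed_semimetric(neg_v0, v0, v0))   # pair entering the v0 side
print(directed_semimetric(v0, v0, v0))       # D(v, v)
```

Only the pair that goes from the $v_0$ side to the $-v_0$ side receives a non-zero value, so on such two-point configurations the semimetric charges an edge in one direction only.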

Definition 49 (minimum directed c-balanced separator). Let $\mathcal{G} = (V, \mathcal{E})$ be a directed graph. Then, the directed edge expansion of a cut $(S, \bar{S})$ is $\frac{|\delta^{out}(S)|}{\min(|S|, |\bar{S}|)}$. A cut $(S, \bar{S})$ that satisfies $|S| \ge c|V|$ and $|\bar{S}| \ge c|V|$ is a c-balanced cut. A c-balanced cut with minimum directed edge expansion is a minimum directed c-balanced separator. Here, $\delta^{out}(S)$ denotes the set of outgoing edges from $S$.
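
As an illustration of these two notions on a toy digraph, consider the sketch below. It takes the directed edge expansion to be $|\delta^{out}(S)| / \min(|S|, |\bar{S}|)$, which is the usual definition and is an assumption here; the function names are hypothetical and $S$ is assumed to be a non-empty proper subset of the nodes.

```python
def directed_expansion(nodes, edges, S):
    """|delta_out(S)| / min(|S|, |S-bar|) for a non-empty proper subset S."""
    S = set(S)
    complement = set(nodes) - S
    out_edges = [(u, v) for (u, v) in edges if u in S and v in complement]
    return len(out_edges) / min(len(S), len(complement))


def is_c_balanced(nodes, S, c):
    """True if both sides of the cut contain at least c * |V| nodes."""
    n = len(nodes)
    s = len(set(S))
    return s >= c * n and (n - s) >= c * n


nodes = [0, 1, 2, 3]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
S = {0, 1}
print(directed_expansion(nodes, edges, S), is_c_balanced(nodes, S, 0.5))
```

Only edges leaving $S$ are counted, so the expansion of $(S, \bar{S})$ and of $(\bar{S}, S)$ can differ.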


Recall that min-POP requires dealing with pre-specified edge orientations, i.e. dealing with rigid edges and forbidden edges respectively. We achieve this effect with a very simple strategy that we outline below. Let $\mathcal{H}_1(V, \mathcal{E})$ be the edge duplicated Hasse digraph from the construction described in subsection 6.2. Let $(u_i, v_i)_{i \in \{1 \dots n\}} \in \mathcal{E}_F \subset \mathcal{E}^*$ be the set of forbidden edges and $(v_i, u_i)_{i \in \{1 \dots n\}} \in \mathcal{E}_R \subset \mathcal{E}^*$ be the set of rigid edges, where $\mathcal{E}^*$ is the extended set of edges of the graph $\mathcal{H}^* = (V^*, \mathcal{E}^*)$ obtained at the end of the reduction procedure. Note that $\mathcal{E}_N \cup \mathcal{E}_F \cup \mathcal{E}_R = \mathcal{E}^*$ and $|\mathcal{E}_R| = |\mathcal{E}_F| = n$. Note that the set of normal edges $\mathcal{E}_N$ is in 1-1 correspondence with the edges $\mathcal{E}$ of the Hasse graph. Assume that the condition $\mathcal{D}(u_i, v_i) + \mathcal{D}(v_i, u_i) = 1$ has already been enforced for every $\mathcal{D}(\cdot, \cdot)$. Therefore, we need to implement either $\mathcal{D}(u_i, v_i) = 0$ (enforcing a forbidden edge) or $\mathcal{D}(v_i, u_i) = 1$ (enforcing a rigid edge), noting that the two conditions are equivalent. When using a divide-&-conquer procedure with min-directed c-balanced cut as a subroutine, it is easier to implement $\mathcal{D}(u_i, v_i) = 0$ (i.e. to specify that we do not cut edge $(u_i, v_i)$ at this or any recursion level) rather than implementing $\mathcal{D}(v_i, u_i) = 1$ (i.e. to specify that we cut the rigid edge $(v_i, u_i)$ at this recursion level). Ensuring $\mathcal{D}(u_i, v_i) = 0$ at each recursion level ensures we have $\mathcal{D}(v_i, u_i) = 1$ at the end of the divide-&-conquer recursion procedure.

Definition 50 (Orthogonality Constraint for edge $(u,v)$). We define the orthogonality constraint for edge $(u,v)$ as follows:

$$\mathcal{D}(u, v) = |v_0 - v|^2 - |v_0 - u|^2 + |v - u|^2 = 0$$

which is equivalent to:

$$|v_0 - v|^2 + |v - u|^2 = |v_0 - u|^2$$

In this case, we proceed with the most naive strategy, i.e. we use the directed metric from Agarwal et al. and equate it to 0 for all forbidden edges. The vector programming formulation for min-POP is basically an extension of Agarwal et al.'s [5] formulation for the minimum Directed c-Balanced Separator Problem (Conditions I, II, III) with an extra fourth condition for handling the forbidden edges.
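
The name of the orthogonality constraint can be motivated by a short computation: expanding the squares gives $\mathcal{D}(u,v) = 2\,(v_0 - v)\cdot(u - v)$, so $\mathcal{D}(u,v) = 0$ exactly when $v_0 - v$ is perpendicular to $u - v$. The sketch below checks this identity numerically; the helper names and the example vectors are illustrative choices and not taken from the source.

```python
import math


def sub(x, y):
    return tuple(p - q for p, q in zip(x, y))


def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


def D(u, v, v0):
    """D(u, v) = |v0 - v|^2 - |v0 - u|^2 + |v - u|^2."""
    a, b, c = sub(v0, v), sub(v0, u), sub(v, u)
    return dot(a, a) - dot(b, b) + dot(c, c)


def orthogonality_gap(u, v, v0):
    """2 * (v0 - v) . (u - v); equals D(u, v) by expanding the squares."""
    return 2.0 * dot(sub(v0, v), sub(u, v))


v0 = (0.0, 0.0, 1.0)
v = (1.0, 0.0, 0.0)
# Unit vector chosen so that (v0 - v) is perpendicular to (u - v).
u = (0.5, math.sqrt(0.5), -0.5)

print(D(u, v, v0), orthogonality_gap(u, v, v0))
```

Both printed values are zero up to floating point error for this choice of $u$, and they agree for arbitrary vectors.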

Vector programming formulation⁹ for (Divide & Conquer Subroutine of) min-POP:

8 We use the symbol $v_i$ to denote a vector that represents vertex $v_i \in \mathcal{G}$. Whether we mean the vertex or the vector that represents it is clear from the context each time the symbol $v_i$ is used.

9 Once the vector programming formulation is written down, it is straightforward to write down the corresponding SDP formulation. See [87]. Going forward we shall use the terms vector program and SDP interchangeably unless otherwise specified. The meaning is clear from the context.







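(min-POPs)

$$\min \sum_{(v_i, v_j) \in \mathcal{E}} \mathcal{D}(v_i, v_j) \qquad (7)$$

s.t.

$$\sum_{(u_i, v_i) \in \mathcal{E}_F} \mathcal{D}(u_i, v_i) = 0 \qquad (8)$$

$$\sum_{i < j} |v_i - v_j|^2 \ge 4c(1 - c)n^2 \qquad (9)$$

$$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_k - v_i|^2 \quad \forall v_i, v_j, v_k \in V \cup \{v_0\} \qquad (10)$$

$$|v_i|^2 = 1 \quad \forall v_i \in V \cup \{v_0\} \qquad (11)$$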

(11) 


Consider the SDP formulation described in (min-POPs). Note that, if we were to remove Equation 9, this would actually be an SDP relaxation for min-POP (see Theorem 51). However, we do not know of a rounding procedure for such a relaxation. Now, Equation 9 ensures that if a cut based rounding is used then the cut will be a c-balanced cut (a necessity for Leighton-Rao divide-&-conquer approximation schemes). Hence we must add Equation 9 so that we can use a balanced-cut based rounding procedure that gets applied as part of a hierarchical divide-&-conquer scheme.


This SDP is indeed a relaxation since every assignment of boolean variables $x_i$ corresponds to a feasible set of vectors:

• $v_i = v_0$, if $x_i = 1$.

• $v_i = -v_0$, if $x_i = 0$.
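
The following sketch checks, on one small example, that such a $\pm v_0$ assignment satisfies the unit-norm, triangle-inequality and balance constraints. For a $\pm v_0$ assignment with $|S|$ vectors equal to $v_0$, the left-hand side of the balance constraint equals $4|S||\bar{S}|$, which is at least $4c(1-c)n^2$ whenever the cut is c-balanced. The helper names are hypothetical, and the reference vector $v_0$ itself is left out of the triple check for brevity.

```python
from itertools import permutations


def sq_dist(x, y):
    return sum((p - q) ** 2 for p, q in zip(x, y))


def integral_vectors(x, v0):
    """Map boolean x_i to v0 (x_i = 1) or -v0 (x_i = 0)."""
    neg = tuple(-c for c in v0)
    return [v0 if xi else neg for xi in x]


def spread(vectors):
    """Sum over i < j of |v_i - v_j|^2 (left-hand side of the balance constraint)."""
    n = len(vectors)
    return sum(sq_dist(vectors[i], vectors[j])
               for i in range(n) for j in range(i + 1, n))


def satisfies_triangle(vectors):
    """Squared-distance triangle inequality for all ordered triples."""
    return all(sq_dist(a, b) + sq_dist(b, c) >= sq_dist(a, c)
               for a, b, c in permutations(vectors, 3))


x = [1, 1, 1, 0, 0, 0, 0, 0]          # |S| = 3, |S-bar| = 5
vecs = integral_vectors(x, (1.0, 0.0))
n, c = len(x), 0.25

print(all(sq_dist(v, (0.0, 0.0)) == 1.0 for v in vecs))  # unit norm
print(satisfies_triangle(vecs))
print(spread(vecs), 4 * c * (1 - c) * n * n)
```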

We now discuss the equations in the formulation for the min-POP subroutine. 


Equation 11 ensures that all vertices lie on the unit sphere.

Equation 10 is the familiar triangle inequality.

Equation 8 is the forbidden edges condition. It is the new condition that we add to the formulation (to incorporate rigid edges). It is essentially the sum of all the forbidden edge constraints described by Definition 50. Note that since all the $\mathcal{D}(u_i, v_i)$ are non-negative, equating their sum to 0 means that all of them have to be 0.

Agarwal et al. [5] give a $c'$-balanced cut that approximates the minimum directed c-balanced separator within a factor of $O(\sqrt{\log n})$. Their work is based on the work of Arora et al. [9], more specifically on the powerful ARV Structure Theorem, which is briefly stated in Theorem 53 in section 9. We modify Agarwal et al.'s algorithm to provide an $O(\sqrt{\log n})$ factor approximation for every subroutine of the divide-&-conquer approximation algorithm for min-POP.


Now, we consider a formulation for feedback-arc-set on some graph $\mathcal{G}(V, \mathcal{E})$ by Grötschel, Jünger and Reinelt in [42]. This special formulation specifies the acyclicity constraints for the given edge set $\mathcal{E}$ in the form of ordering constraints on the complete graph with vertex set $V$. They show that destroying cycles that are specific to edge set $\mathcal{E}$ is equivalent to destroying all length 2 and length 3 cycles in the complete graph. The information about the edge set $\mathcal{E}$ is accommodated only in the objective function, as shown in the formulation (min-FAS(GJR)).






























(min-FAS(GJR))

$$\min \sum_{e_{i,j} \in \mathcal{E}} d_{ij} \qquad (12)$$

s.t.

$$d_{ij} \ge 0 \qquad (13)$$

$$d_{ij} + d_{ji} = 1 \qquad (14)$$

$$d_{ij} + d_{jk} \ge d_{ik} \qquad (15)$$
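
A linear ordering of the vertices yields an integral solution of this formulation. The sketch below assumes the convention that $d_{ij} = 1$ when $j$ precedes $i$ in the ordering (so the arc $i \to j$ points backwards and belongs to the feedback arc set) and $d_{ij} = 0$ otherwise; this convention and the function names are illustrative assumptions.

```python
from itertools import permutations


def ordering_to_d(order):
    """d[i, j] = 1 if j precedes i in `order` (arc i -> j is a back arc), else 0."""
    pos = {v: k for k, v in enumerate(order)}
    return {(i, j): 1 if pos[i] > pos[j] else 0
            for i in order for j in order if i != j}


def satisfies_gjr(d, nodes):
    """Check constraints (13)-(15) on the complete graph over `nodes`."""
    for i, j in permutations(nodes, 2):
        if d[i, j] < 0 or d[i, j] + d[j, i] != 1:
            return False
    for i, j, k in permutations(nodes, 3):
        if d[i, j] + d[j, k] < d[i, k]:
            return False
    return True


def feedback_arcs(edges, d):
    """Objective (12): number of arcs of `edges` that point backwards."""
    return sum(d[e] for e in edges)


nodes = [0, 1, 2]
edges = [(0, 1), (1, 2), (2, 0)]      # a directed 3-cycle
d = ordering_to_d([0, 1, 2])
print(satisfies_gjr(d, nodes), feedback_arcs(edges, d))
```

Note that the edge set enters only through `feedback_arcs`, mirroring the remark that the constraints live on the complete graph while the edges appear only in the objective.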


Theorem 51. The formulation described in (min-POPs) is a correct formulation for a subroutine within the SDP-based Divide & Conquer Algorithm for min-POP.

Proof. To begin with, we check that the directed semimetric $\mathcal{D}(\cdot, \cdot)$ in the SDP formulation (min-POPs) satisfies the conditions prescribed for $d_{ij}$ in the LP formulation (min-FAS(GJR)).

• We have $\mathcal{D}(u, v) = |v_0 - v|^2 - |v_0 - u|^2 + |v - u|^2$. The triangle inequality of Equation 10, applied to $v_0$, $v$ and $u$, gives $|v_0 - v|^2 + |v - u|^2 \ge |v_0 - u|^2$, so clearly $\mathcal{D}(u, v) \ge 0$.

• Secondly, note that, by simple application of the definition, $\mathcal{D}(u, v) + \mathcal{D}(v, w) - \mathcal{D}(u, w) = |v - u|^2 + |w - v|^2 - |w - u|^2$. However, since we work with an $\ell_2^2$ metric in the ARV algorithm, we have $|v - u|^2 + |w - v|^2 \ge |w - u|^2$, which gives us $\mathcal{D}(u, v) + \mathcal{D}(v, w) \ge \mathcal{D}(u, w)$.

• Finally, the rounding algorithm of Agarwal et al. employs a directed cut hierarchically until every directed edge (or its counterpart) is cut at some recursion level. Therefore, we satisfy $\mathcal{D}(u, v) + \mathcal{D}(v, u) = 1$ for every directed edge $(u, v)$ on the complete graph.

It follows from the Grötschel-Jünger-Reinelt formulation of min-feedback arc set given in the LP formulation (min-FAS(GJR)) that the graph obtained at the end of the Divide-&-Conquer based SDP algorithm will not have any cycles. Also, we have the forbidden edge constraint $\sum_{(u_i, v_i) \in \mathcal{E}_F} \mathcal{D}(u_i, v_i) = 0$.¹⁰ Now, we have already proved in Theorem 26 and Theorem 44 that if (a) the rigid edge formulation does not have any cycles and (b) the rigid edges are preserved, then a corresponding orientation of the original Hasse graph will not have any matching or cycle constraint violations. Hence, if we were to consider merely Equation 8, Equation 10 and Equation 11, then we would get an SDP formulation for min-POP. But since (for the purpose of appropriate rounding) we need to approximate a c-balanced cut at each recursion level, we need Equation 9 in addition to the above three equations in order to formulate an SDP meant for the min-POP subroutine. □


10 In subsection 9.2 (Theorem 60) we prove that the Agarwal et al. rounding procedure preserves the rigid edges. In other words, the rounding procedure ensures that none of the forbidden edges are cut.





























9. An $O(\log^{3/2}(N))$ factor Approximation Algorithm based on ARV-rounding

9.1. Adding the Orthogonality Constraint to the ARV formulation 

9.1.1. Sparsest Cut and the ARV Structure Theorem 


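Sparsest Cut

$$\min \sum_{(i,j) \in E} c_{ij}\, |v_i - v_j|^2 \qquad (16)$$

s.t.

$$|v_i|^2 = 1 \qquad (17)$$

$$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2 \qquad (18)$$

$$\sum_{i < j} |v_i - v_j|^2 = \Omega(n^2) \qquad (19)$$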

The original ARV algorithm was designed for the sparsest cut problem that is specified in Sparsest Cut. The ARV algorithm crucially depends on the ARV structure theorem to prove the approximability result. The structure theorem is stated below (see Theorem 53).

Definition 52 ($\ell_2^2$ metric space). An $\ell_2^2$ representation of a graph $G$ is an assignment of a point (vector) $v_i \in \mathbb{R}^k$ to each node $i$ s.t. for all $i, j, k$:

$$|v_i - v_j|^2 + |v_j - v_k|^2 \ge |v_i - v_k|^2$$

It is a unit $\ell_2^2$ representation if all points lie on the unit sphere, i.e. $|v_i| = 1$, $\forall i$.

Theorem 53 (ARV Structure Theorem). For a set of points $v_1, v_2, \dots, v_n$, if the following conditions are satisfied:

1. All points lie on the unit ball in $\mathbb{R}^n$.

2. The points form an $\ell_2^2$ metric w.r.t. $d_{ij}$, where $d_{ij} = \|v_i - v_j\|^2$.

3. All points are well-separated, i.e. $\sum_{i,j} d_{ij}/n^2 \ge \delta = \Omega(1)$.

Then $\exists\, S, T$ disjoint subsets of $V$ s.t. $|S|, |T| \ge \Omega(n)$ which satisfy the following property:

$$\min_{i \in S,\, j \in T} d_{ij} \ge \Omega(1/\sqrt{\log n})$$


Note 9.1 (Fat hyperplanes and rigid edges). In Algorithm 2 we handle the rigid edges explicitly. This variation to the ARV algorithm is justified because the ARV structure theorem guarantees that $S$ and $T$, which are of size $\Omega(n)$, are well-separated. Following this, the addition of new points to $S$ and $T$ to form sets $\mathcal{X}$ and $\mathcal{Y}$ is done in a manner that respects the rigid edge specifications. Also, $S \subset \mathcal{X}$ and $T \subset \mathcal{Y}$. So, clearly $\mathcal{X}$ and $\mathcal{Y}$ are both of size $\Omega(n)$. So, if the ARV structure theorem was applicable to $(S, T)$ then it is also applicable to $(\mathcal{X}, \mathcal{Y})$. Finally, we have $\mathcal{X} \cup \mathcal{Y} = V$. So, the $(\mathcal{X}, \mathcal{Y})$ partition of vectors produces a graph cut.









Algorithm 2 ARV-Algorithm 



In the figure accompanying Algorithm 2 a hypersphere is shown projected as a disc. In this particular figure, the line seen passing through the origin depicts the random hyperplane. In the later sections, we use slightly different visual conventions. The strip in pink is the fat hyperplane around the random hyperplane. All points lie on the sphere. This 2-dimensional picturization is a universally followed convention. In our case, however, apart from the visual role, some of the correctness proofs depend on the 2-dimensional projections and the related projective and affine geometry.



















Figure 11: Sections and Axes of Hyperplanes 


9.1.2. Randomized hyperplanes and 2-plane projections 


Note that the SDP solution consists of a distribution of vectors on an n-dimensional sphere. In subsubsection 9.1.3 we introduce a new type of constraint called the orthogonality constraint. In order to see how and why the orthogonality constraint helps us model rigid edges, we need to consider the projection of the n-dimensional SDP solution on a 2-dimensional plane. Apart from the visual cues and the intuition, 2-projections provide a more convenient language for the proof and for the reasoning behind the mechanism of the orthogonality constraint. To be sure, we do not project the n-dimensional SDP solution on the 2-dimensional plane as part of the algorithm, but only as part of the analysis.

Definition 54 (Section of the Hyperplane). Given a hypersphere and an intersecting hyperplane 
that passes through its center, the section of a hyperplane is obtained by taking all points of 
intersection between the hypersphere and the hyperplane. 

Definition 55 (2-Section of a Hyperplane). Given a hypersphere, a hyperplane passing through 
the center of the hypersphere and a 2-plane, we obtain a 2-section of a hyperplane by projecting 
a section of a hyperplane on a 2-plane. 


We disregard the measure zero case when the 2-plane is parallel to the hyperplane.

Definition 56 (Major Axis of a 2-Section of a Hyperplane). The hypersphere, when projected on a 2-plane, gives us a circle. The major axis of a 2-section of a hyperplane is the maximal-length straight line joining any two points on the 2-section of the hyperplane.


It is easy to check that the length of the major axis of a 2-section of a hyperplane is always 
equal to the diameter of the hypersphere. We refer to the major axis of a 2-section of a hyperplane 
simply as ’major axis of a hyperplane’. 

Definition 57 (Hypercircle). Here, we use the non-standard term hypercircle to mean a cross-section of the hypersphere obtained by taking all points on the hypersphere that are at a given distance $\Phi$ from a fixed point $v_0$. Different values of $\Phi$ give us different hypercircles.




In Figure 11, we see that two different hyperplanes may share the same major axis. The set of hyperplanes that share a major axis forms an equivalence class. Given a projection on a 2-plane, although a hyperplane is better (and yet not uniquely) represented by a 2-section, it is sufficient to limit our interest strictly to the major axis of the hyperplane, because the behavior of the hyperplane can be entirely gauged from the role its major axis plays in separating various points on a 2-plane. In Figure 12 we can visualize the five different cases concerning 2-plane projections. To follow a convention, we will say that a point is accepted if it lies in the same hemisphere as $v_0$ (i.e. above the major axis) and rejected if it lies in the opposite hemisphere (i.e. below the major axis).


1. $b$ lies on the major axis of the hyperplane and $a$ lies under it ($b$ may or may not be accepted, $a$ is rejected).

2. $a$ lies on the major axis of the hyperplane and $b$ lies above it ($b$ is accepted, $a$ may or may not be rejected).

3. The major axis of the hyperplane lies between $b$ and $a$: $b$ is above and $a$ is below. ($b$ is accepted and $a$ is rejected.)

4. Both $a$ and $b$ lie below the major axis. (Both $a$ and $b$ are rejected.)

5. Both $a$ and $b$ lie above the major axis. (Both $a$ and $b$ are accepted.)
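
The five cases can be told apart from the signs of two inner products. The sketch below is illustrative only: it assumes the hyperplane passes through the origin with a given normal vector, treats the side containing $v_0$ as "above", and returns `None` for configurations outside the five listed cases. The function names `side` and `classify` are hypothetical.

```python
def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


def side(x, normal, v0, eps=1e-12):
    """+1 if x is on the v0 side (accepted), -1 if opposite (rejected), 0 if on the axis."""
    orient = 1.0 if dot(v0, normal) >= 0 else -1.0
    s = orient * dot(x, normal)
    if s > eps:
        return 1
    if s < -eps:
        return -1
    return 0


def classify(a, b, v0, normal):
    """Return the case number (1-5) for the pair (a, b), or None if no case applies."""
    sa, sb = side(a, normal, v0), side(b, normal, v0)
    if sb == 0 and sa < 0:
        return 1
    if sa == 0 and sb > 0:
        return 2
    if sb > 0 and sa < 0:
        return 3
    if sa < 0 and sb < 0:
        return 4
    if sa > 0 and sb > 0:
        return 5
    return None


v0, normal = (0.0, 1.0), (0.0, 1.0)
print(classify((0.6, -0.8), (0.6, 0.8), v0, normal))   # b above, a below
print(classify((0.6, 0.8), (-0.6, 0.8), v0, normal))   # both above
```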

Now, given a rigid edge b —> a, we put two requirements on the 2-plane in order to determine 
the fate of b —> a: 


1. The 2-section of the hyperplane coincides with its major axis. This happens when the 
hyperplane is perpendicular to the 2-plane of projection. 

2. Having made the hyperplane perpendicular to the 2-plane, we still have an additional degree of freedom: we may rotate the 2-plane about the line perpendicular to the hyperplane, and we keep doing that till $b$ appears on the circle (which is the same as saying that the 2-plane $\boxplus_{Obv_0}$, determined by $O$, $b$ and $v_0$, is identical to the 2-plane of projection).

Going forward we will not distinguish between the 2-section and the major axis of the hyperplane 
since we have chosen our plane of projection to be perpendicular to the line of vision. Also, 
henceforth, the major axis will be considered to be the projection of the hyperplane. 

The following lemma demonstrates how one may use edge-by-edge 2-projections to infer 
whether or not a particular edge is cut by the hyperplane in n-dimensions. 

Lemma 58 (From 2-dimensions to n-dimensions). Given two points $x$ and $y$, if their 2-projections, say $x'$ and $y'$, lie on opposite sides of the major axis, then $x$ and $y$ lie on opposite sides of the hyperplane in n-dimensions.


Proof. All these conclusions follow from a single elementary observation: suppose we are given an n-dimensional sphere and suppose that a given hyperplane passes through its center. Irrespective of which 2-plane we project it on, the major axis will always be of the size of the diameter and will divide the projection into two equal semicircular arcs. In Figure 12, apart from $b$, the lengths of the segments $v_0 - P$ and $v_0 - Q$ are also preserved. Also, if two points, say $x'$ and $y'$, lie on opposite sides of the major axis, then within the 2-projection, the segment $x' - y'$ will intersect the major axis at a point, say $z'$. Now, clearly, in n-dimensions, there exists a point $z$ that lies on the hyperplane and also on the segment $x - y$, where $x'$ and $y'$ are the projections of $x$ and $y$. Therefore, given any two points $x$ and $y$, if their projections $x'$ and $y'$ lie on opposite sides of the major axis, then $x$ and $y$ lie on opposite sides of the hyperplane. □
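
A numerical illustration of the lemma, for the setting used in the text where the hyperplane is perpendicular to the 2-plane of projection, is sketched below. It assumes that the 2-plane is spanned by two orthonormal vectors `e1`, `e2` and that the hyperplane's normal lies in that plane; in that case the inner product with the normal is unchanged by projection, so the side of a point can be read off from its 2-projection. All names and example vectors are illustrative.

```python
def dot(x, y):
    return sum(p * q for p, q in zip(x, y))


def project(x, e1, e2):
    """Coordinates of the orthogonal projection of x onto span(e1, e2).

    Assumption: e1 and e2 are orthonormal.
    """
    return (dot(x, e1), dot(x, e2))


def opposite_sides_nd(x, y, normal):
    """True if x and y lie strictly on opposite sides of the hyperplane."""
    return dot(x, normal) * dot(y, normal) < 0


def opposite_sides_2d(x, y, normal, e1, e2):
    """Same test carried out on the 2-projections only."""
    n2 = project(normal, e1, e2)
    return dot(project(x, e1, e2), n2) * dot(project(y, e1, e2), n2) < 0


e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
normal = (0.6, 0.8, 0.0)               # lies in span(e1, e2)
x = (0.1, 0.2, 0.97)
y = (0.3, -0.5, 0.81)

print(opposite_sides_2d(x, y, normal, e1, e2), opposite_sides_nd(x, y, normal))
```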










Figure 12: 2-plane Projection: 5 Cases 


Example 9.1. In Figure 12 we observe that for case 1, $a$ and $v_0$ lie on opposite sides; for case 2, $b$ and $v_0$ lie on the same side; for case 3, $\{b, v_0\}$ and $a$ are on opposite sides; for case 4, $\{b, a\}$ and $v_0$ lie on opposite sides; and for case 5, $\{a, b\}$ and $v_0$ lie on the same side. These cases are mutually exclusive and collectively exhaustive, and in each of the cases it is ensured that the edge $a \to b$ isn't cut.


9.1.3. Handling Forbidden Edges: ARV Rounding 


The purpose of this subsection is merely to establish the fact that one can accommodate the orthogonality constraint within the randomized hyperplane rounding employed within the ARV framework. In subsection 9.2 we use the ARV structure theorem for directed balanced cut rounding.

Lemma 59 (Orthogonality and Random Hyperplane Rounding). If the orthogonality condition for edge $(a, b)$ is satisfied (i.e. $\mathcal{D}(a, b) = |v_0 - b|^2 - |v_0 - a|^2 + |b - a|^2 = 0$) and if we cut the unit hypersphere $S$ with a random hyperplane $\mathcal{P}$, then only one of the following two possibilities can occur:


1. $(b, a)$ is cut, thus ensuring that $(a, b)$ will never be cut.

2. Neither $(a, b)$ nor $(b, a)$ is cut.


Proof. Suppose that we have solved the SDP and all vectors obtained from the Cholesky
decomposition lie in their optimal configuration on the unit hypersphere $S$. To understand how
rounding by some hyperplane $\mathcal{P}$ separates the vectors, project the hypersphere
$S$ and $\mathcal{P}$, with the optimal configuration of all points, on the 2-plane $\boxplus_{Obv_0}$ which is determined by
$v_0$, $b$ and the origin $O$. Note that since $\boxplus_{Obv_0}$ contains both $b$ and $v_0$, the projections of $b$ and $v_0$
will lie on the edge of the projection of $S$ on $\boxplus_{Obv_0}$ (which will be a disc). Hence, $b$ and $v_0$ will


















Figure 13: Random Hyperplane: Orthogonality Constraint 


lie on the bounding circle that will contain projections of all the vectors on the 2-plane $\boxplus_{Obv_0}$.
The projection of $\mathcal{P}$ on $\boxplus_{Obv_0}$ will be a line (as long as we disregard the event
where the random hyperplane $\mathcal{P}$ is parallel to the plane of projection $\boxplus_{Obv_0}$, the probability
measure of which is 0). Let $v_0'$ denote the point of $S$ antipodal to $v_0$. Since the line $v_0' - O - v_0$ is parallel to the plane of projection, the
projection of $v_0'$ will lie on the circle and the projection of the line $v_0' - O - v_0$ will determine
a diameter of this circle. Thus $v_0'$ and $v_0$ will subtend a right angle at $b$, i.e. the two lines
$b - v_0'$ and $b - v_0$ will remain perpendicular before and after projection since both these lines
are parallel to the plane of projection $\boxplus_{Obv_0}$. But the line $v_0' - b$ is one of the lines contained
in the hyperplane (say $H_0$) perpendicular to the line $v_0 - b$. Hence the projection of $H_0$ will
be $v_0' - b$. The position of point $a$ after projection will be somewhere along the segment $v_0' - b$
since the segment $a - b$ lies in the hyperplane perpendicular to the line $v_0 - b$. It may very well
happen that $a$ and $b$ get projected on the same point in the 2-plane $\boxplus_{Obv_0}$. But this happens only when
segment $a - b$ is perpendicular to the 2-plane $\boxplus_{Obv_0}$. If we consider all degrees of freedom of
$a - b$, then the case of being precisely perpendicular to the 2-plane $\boxplus_{Obv_0}$ is a measure zero
event. So, we consider all possible positions of $a$ on segment $v_0' - b$ except point $b$.


In Figure 13 we observe different positions of $a$ (shown as $a_1^1$, $a_1^2$, $a_1^3$ and $a_2^1$, $a_2^2$,
$a_2^3$ respectively) for two different positions of $b$, i.e. $b_1$ and $b_2$. Let us assume a position for $b$
(without loss of generality) and proceed with projecting the hypersphere $S$ on the 2-plane $\boxplus_{Obv_0}$
as described above. Then, any random hyperplane (denoted as a line here after projection on
the 2-plane) will lie between lines $O - AB$ and $O - DA$ as shown in the figures below. So now
we shall have five cases:











Case I: The projection of the hyperplane lies between lines $O - AB$ and $O - BC$.
$\mathcal{D}(b, a) = 0$ and $\mathcal{D}(a, b) = 0$ (since both $a$ and $b$ lie in the bottom hemisphere (w.r.t. $\mathcal{P}$)$^{11}$
while $v_0$ lies in the top hemisphere, neither $ab$ nor $ba$ is cut).

Case II: The projection of the hyperplane lies between lines $O - BC$ and $O - CD$.
$\mathcal{D}(b, a) = 1$ and $\mathcal{D}(a, b) = 0$ (since $b$ lies in the top hemisphere along with $v_0$ and $a$ lies in
the bottom hemisphere, $ba$ is cut but $ab$ is not).

Case III: The projection of the hyperplane lies between lines $O - CD$ and
$O - DA$. $\mathcal{D}(b, a) = 0$ and $\mathcal{D}(a, b) = 0$ (since all three points $v_0$, $a$ and $b$ lie in the top
hemisphere, neither $ab$ nor $ba$ is cut).

Case IV: $b$ lies on the projection of the hyperplane. So the point $v_0$ and the segment
$v_0' - b$ (where $v_0'$ is the point antipodal to $v_0$) lie on opposite sides of the projected hyperplane. We have seen before that
the projection of $a$ lies on $v_0' - b$. Therefore, $a$ and $v_0$ lie on opposite sides of the projected
hyperplane. Therefore, using the conclusions of Theorem 58, we conclude that $a$ and $v_0$ lie
on opposite sides of the hyperplane in $n$ dimensions. Finally, $b$ lies in the fat hyperplane.
So, it may go to either of the two sides, while $a$ stays opposite $v_0$. Hence $ab$ is never
cut. However, $ba$ may be cut (if $b$ is assigned to the same set as $v_0$) or neither of the two
will be cut (if $b$ is assigned to the side opposite $v_0$).

Case V: $a$ lies on the projection of the hyperplane. Since $v_0 - a$ is the hypotenuse (even
in the projected space), $b$ and $v_0'$ (the point antipodal to $v_0$) will lie on opposite sides of the projected hyperplane
(since we know that the projection of $a$ lies on the segment $v_0' - b$). Therefore $b$ and $v_0$ are
on the same side while $a$ lies in the fat hyperplane. So, if $a$ is assigned to the same set
as $\{b, v_0\}$, then neither of the edges $ba$ and $ab$ will be cut, whereas if $a$ is assigned to the
same set as $v_0'$, then edge $ba$ is cut. Note that we implicitly use the results of Theorem 58
to generalize our conclusions from the 2-dimensional projection to $n$ dimensions.


Since from our point of view angles vary only from 0 to $\pi$, once we reach $O - DA$ starting
from $O - AB$ we encounter repetition. So, with the above five cases in mind, we are done. In
other words, for Case II, $\mathcal{D}(b, a) = 1$ and $\mathcal{D}(a, b) = 0$ in this iteration. For Case I and Case III,
the decision is deferred to later iterations of divide and conquer (i.e. while in this iteration
neither $(a, b)$ nor $(b, a)$ is cut, eventually in one of the iterations, one of the two has to be cut
for the algorithm to terminate). Since our choice of $a$ and $b$ was arbitrary, this analysis can be
seen to hold for all the specifications of the type $\mathcal{D}(a, b) = 0$. □
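
The planar part of the case analysis can be checked by simulation. The Python sketch below works only in the projected 2-plane: it fixes $v_0$ at the top of the unit circle, samples $b$ on the circle, places the projection of $a$ strictly inside the segment joining $b$ to the point antipodal to $v_0$, and draws a random line through the origin. It then counts how often each directed edge is cut, where an edge $(x, y)$ counts as cut when $x$ is on the side of $v_0$ and $y$ is not. The sampling ranges and names are assumptions made for illustration, and the sketch does not replace the lifting argument to $n$ dimensions via Theorem 58.

```python
import math
import random

def dot(x, y):
    """Dot product of two planar vectors."""
    return x[0] * y[0] + x[1] * y[1]

def planar_cut_counts(trials=20000, seed=1):
    """Count, in the projected 2-plane only, how often each edge is cut.

    v0 is the top of the unit circle and v0_opp its antipode. b is random
    on the circle; a is random strictly inside the segment v0_opp -- b.
    """
    rng = random.Random(seed)
    v0, v0_opp = (0.0, 1.0), (0.0, -1.0)
    counts = {"ab_cut": 0, "ba_cut": 0, "neither": 0}
    for _ in range(trials):
        theta = rng.uniform(0.0, 2.0 * math.pi)
        b = (math.cos(theta), math.sin(theta))
        t = rng.uniform(0.01, 0.99)
        a = (t * v0_opp[0] + (1 - t) * b[0], t * v0_opp[1] + (1 - t) * b[1])
        phi = rng.uniform(0.0, math.pi)      # normal of a random line through O
        u = (math.cos(phi), math.sin(phi))
        side_v0 = dot(u, v0) > 0
        a_with_v0 = (dot(u, a) > 0) == side_v0
        b_with_v0 = (dot(u, b) > 0) == side_v0
        if a_with_v0 and not b_with_v0:
            counts["ab_cut"] += 1            # would contradict the planar argument
        elif b_with_v0 and not a_with_v0:
            counts["ba_cut"] += 1            # Case II
        else:
            counts["neither"] += 1           # Cases I and III
    return counts
```

In this planar setting the count for `ab_cut` stays at zero, because the projection of $a$ is a convex combination of $b$ and the antipode of $v_0$, both of which lie on the side opposite $v_0$ whenever $b$ does.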


Apart from the measure zero event when $a$ and $b$ coincide to a single point when projected on
the 2-plane, we can always switch from the 2-dimensional reasoning to the $n$-dimensional
reasoning. To see this, let $\mathcal{P}_2$ be the projection of the hyperplane on the 2-plane. Once again,
ignoring the measure zero event where the random hyperplane is parallel to the 2-plane, the
projection $\mathcal{P}_2$ will be seen as a line through the origin.

We never really use Theorem 59 directly. So, whatever is discussed in this section is
mostly to make the following section more readable.


9.2. Extension to Agarwal et al.'s Rounding: Why are forbidden edges never cut?

To begin with, note that Agarwal et al.'s algorithm is a pseudo-approximation algorithm, i.e.
although we are supposed to find the approximately minimal directed $c$-balanced cut, what
Agarwal et al.'s algorithm instead offers is a $c'$-balanced cut where $c' > \alpha c$ for any fixed
$\alpha < 1$. For the purpose of divide-&-conquer the $c'$-balanced cuts serve equally well and the
analysis for the Leighton-Rao algorithm is actually identical. So we shall use the terms
pseudo-approximation algorithm and approximation algorithm (in the context of Agarwal et al.'s algorithm)
interchangeably.

11 We always mean the hemisphere above or below the random hyperplane $\mathcal{P}$ unless specified otherwise.


The original algorithm by Agarwal et al. and its analysis can be found in Algorithm 4 pg.7
of [5]. Finally, in the lemma that follows, we establish the fact that the rounding procedure
described in Algorithm 3 (i.e. the identification of sets $\mathcal{A}$ and $\mathcal{B}$) preserves forbidden edges.

Lemma 60. The Agarwal et al. style of rounding (i.e. ARV rounding modified for directed
balanced cuts) preserves forbidden edges.


Proof. As shown in the figures below, consider the line joining $v_0$ and the origin $O$. The set of
points on the hypersphere $S$ lying at a distance $\Phi$ from $v_0$ will lie on a hypercircle.$^{12}$
The projection of the hypercircle on a non-parallel 2-plane will be a segment. In the projection,
this segment will always be perpendicular to the line joining $v_0$ and the origin $O$. The dotted
segments in green denote the distance or radius $\Phi$ from $v_0$, while the dotted segment in gray is the
projection of the hypercircle on the 2-plane which, as we see, is perpendicular to the line $v_0 - O$.


Consider forbidden edge constraints $\mathcal{D}(a, b) = 0$ and $\mathcal{D}(A, B) = 0$. We have shown two cases
of Algorithm 3, namely when $|V^-| > |V^+|$, in which case $\mathcal{A} = U_1^+$, and when $|V^+| > |V^-|$, in
which case $\mathcal{A} = V^+$. In both cases, consider the gray dotted line (projection of the hypercircle) to
be the hyperplane.


Assume without loss of generality that in both cases $b$ is above the hyperplane, i.e. assume
that $\mathcal{D}(v_0, b) < \Phi^2$, and assume that $\mathcal{D}(v_0, B) > \Phi^2$. Therefore, in both cases, $b$ lies in
$\mathcal{A}$ and $B$ lies in $\bar{\mathcal{A}}$. We make our argument using the visual aid of the figure alongside
Algorithm 3, where we consider various positions for $B$ and $b$. The positions considered represent
all possible scenarios and there is no loss of generality in restricting our attention to the specific
positions in the diagram. Also, in both cases, consider without loss of generality three possible
positions for $a$, i.e. $a^1$ lying in $\mathcal{A}$ while $a^2$ and $a^3$ lie in $\bar{\mathcal{A}}$.

As before, the 2-plane of projection is the plane determined by points $O$, $v_0$ and $b$. (We disregard
the measure zero case where segment $a - b$ is perpendicular to the 2-plane determined by $O$,
$v_0$ and $b$.) Hence we see $b$ on the edge of the circle. But, by construction, edge $(v_0, a)$ is the
hypotenuse of a right angled triangle, i.e. $\mathcal{D}(v_0, a) > \mathcal{D}(v_0, b)$. Hence, if $b$ is in $\mathcal{A}$ then $a$ may or
may not be in $\mathcal{A}$, e.g. in position $a^1$ it is seen to be in $\mathcal{A}$ while in positions $a^2$ and $a^3$ it is seen to
be in $\bar{\mathcal{A}}$. Hence if $b \in \mathcal{A}$, we are done.

Coming to the more important case, consider the 2-plane of projection to be the plane determined by
points $O$, $v_0$ and $B$, with $B$ on the edge of the circle.$^{13}$ Observe that $B \in \bar{\mathcal{A}}$. However, once again
by construction, edge $(v_0, A)$ is the hypotenuse of a right angled triangle, i.e. $\mathcal{D}(v_0, A) > \mathcal{D}(v_0, B)$.
So irrespective of the position of $A$, i.e. $A^1$, $A^2$, $A^3$ (without loss of generality) or any other position,
clearly the point $A \in \bar{\mathcal{A}}$. Hence we are done, i.e. the directed balanced cut rounding preserves
the forbidden edge condition.


12 Here, we use the non-standard term hypercircle to mean a section of the hypersphere obtained by taking all
points on the hypersphere that are at distance $\Phi$ from $v_0$. Each time we change the distance $\Phi$ we get a different
hypercircle corresponding to it.

13 We disregard the measure zero case where segment $A - B$ is perpendicular to the 2-plane determined by $O$, $v_0$
and $B$.










Algorithm 3 Rounding the min-DBCRE SDP (Modification of Algorithm by Agarwal et al.)

Input: SDP Formulation of min-POP

Output: A $c'$-balanced cut $(\mathcal{A}, \bar{\mathcal{A}})$ that approximates a single subroutine of min-POP within a
factor of $O(\sqrt{\log(n)})$

 1: Solve the SDP relaxation for min-POP to obtain a unit-$\ell_2^2$ representation of $m$ vectors $V_0 = \{v_1, \dots, v_m\}$
 2: Apply the ARV Algorithm on the set of vectors $V_0$ to find $\alpha$-large, $\Delta$-separated sets $U$ and $V$
 3: Find radius $\Phi$ such that at least half of the vectors corresponding to vertices from $U$ lie inside
    the ball of radius $\Phi$ with center at the point $v_0$, and at least half of the vectors lie outside the
    ball (with boundary points counted inside as well as outside).
 4: Let $U_1^+ = \{i \in U : |v_0 - v_i|^2 \leq \Phi^2\}$. Let $U_1^- = \{i \in U : |v_0 - v_i|^2 \geq \Phi^2\}$.
 5: Let $V^+ = \{i \in V : |v_0 - v_i|^2 \leq \Phi^2\}$. Let $V^- = \{i \in V : |v_0 - v_i|^2 \geq \Phi^2\}$.
 6: if $|V^+| \geq |V^-|$ then
 7:     $\mathcal{A} = V^+$, $\mathcal{B} = U_1^-$
 8: else
 9:     $\mathcal{A} = U_1^+$, $\mathcal{B} = V^-$
10: end if
11: Choose random $\Delta > 0$
12: Choose $\alpha$ randomly s.t. $0 < \alpha < \Delta$, and let $Z = V_0 - \mathcal{A}$
13: ▷ Handling of rigid edges with $v_i$ as the destination node.
14: for $a \in \mathcal{A}$ such that $\{v_i : |v_i - a|^2 < \alpha\}$ do
15:     if $\exists\, v_k \in Z$ such that $\{v_k, v_i\}$ is a rigid edge in $\mathcal{E}$ then
16:         $\mathcal{A} \leftarrow \mathcal{A} +$
17:     end if
18: end for
19: $\bar{\mathcal{A}} = V_0 - \mathcal{A}$
20: return $(\mathcal{A}, \bar{\mathcal{A}})$
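
The radius and partition step of Algorithm 3 (finding $\Phi$ and the sets $U_1^{\pm}$, $V^{\pm}$) can be sketched in Python as follows. This is only an illustration under stated assumptions: the separated sets `U` and `V` are taken as given (the ARV step is not implemented), the radius is chosen as the median distance of the vectors of `U` from $v_0$, and the function name `ball_partition` is hypothetical.

```python
from statistics import median

def sq_dist(x, y):
    """Squared Euclidean distance |x - y|^2."""
    return sum((p - q) ** 2 for p, q in zip(x, y))

def ball_partition(v0, vectors, U, V):
    """Sketch of the radius/partition step of Algorithm 3.

    vectors maps a vertex id to its SDP vector; U and V are the two
    separated vertex sets returned by the ARV step (assumed given).
    Returns (Phi^2, A, B).
    """
    # Median squared distance of U from v0: at least half of U lies in the
    # closed ball and at least half lies outside (boundary counted twice).
    phi_sq = median(sq_dist(v0, vectors[i]) for i in U)
    U_plus = {i for i in U if sq_dist(v0, vectors[i]) <= phi_sq}
    U_minus = {i for i in U if sq_dist(v0, vectors[i]) >= phi_sq}
    V_plus = {i for i in V if sq_dist(v0, vectors[i]) <= phi_sq}
    V_minus = {i for i in V if sq_dist(v0, vectors[i]) >= phi_sq}
    # Pick the larger side of V and pair it with the opposite side of U.
    if len(V_plus) >= len(V_minus):
        A, B = V_plus, U_minus
    else:
        A, B = U_plus, V_minus
    return phi_sq, A, B
```

Taking the median is one simple way to satisfy the "at least half inside, at least half outside" requirement; any radius with that property would serve equally well in this sketch.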














Finally, the only modification we make to Agarwal et al.'s rounding procedure in Algorithm 3
is the handling of rigid edges in the context of fat hyperplanes in Lines 14-20. The justification for
this modification is identical to the one provided in Note 9.1. □


Note 9.2 (Handling Forbidden edges via Sum of Orthogonality constraints). From
Theorem 60 it is clear that, if for all $i \in \{1, \dots, F\}$ we simply add the constraint:



to the minimum directed $c$-balanced cut formulation as applied on the extended
Hasse graph $\mathcal{H}^* = (\mathcal{V}^*, \mathcal{E}^*)$, we can ensure that the rounding will either

1. choose a rigid edge (by cutting it), i.e. avoid the complementary forbidden edge, or

2. defer the decision to some later iteration

and give us an $O(\sqrt{\log(n)})$-factor pseudo-approximation at each iteration of the modified
balanced cut subroutine.


10. An MWUM based $O(\log^2(\mathcal{N}))$ factor Nearly Linear Time Algorithm

10.1. Arora-Kale’s Primal Dual Matrix MWUM 

The solution to MMUP using MWUM has special consequences across a wide range of problems
in computational topology. We touch upon the most important of its applications here, namely
computation of ordinary homology, persistent homology and scalar field topology described in

EED 


10.1.1. Multiplicative Weights Update Method 


The authors of the multiplicative weights update method (MWUM), Arora, Hazan and Kale [6],
have the following to say about its general place in computer science algorithms:

“.. .We feel that this meta-algorithm and its analysis are simple and useful enough 
that they should be viewed as a basic tool taught to all algorithms students together 
with divide-and-conquer, dynamic programming, random sampling, and the like...” 


While MWUM as an algorithmic flavor has been around for a while, the recent work
[6], [8] provides a more general framework encompassing several variants developed
independently in different fields over a long period of time. See Table 5 for an analogy between the
algorithmic primitives and an experts' framework.

Algorithm                     Analogy
----------------------------  -----------------------------------------
The algorithm                 A decision maker
Weights on n variables        Weights on n experts
Iteration of weight updates   Perception shift on expert supplied value
Maximize objective            High total payoff in long run

Table 5: MWUM: Experts Model





















While obviously the best decision is not known beforehand, we do know the objective function 
and there is a way to calculate the payoff in each round. This makes it possible to eventually
arrive at optimal decision-making by associating weights to each expert-advice, and choosing a 
prediction each time based on weighted majority of experts’ prediction. Depending on whether 
or not the prediction is accurate, the weights associated to experts are updated, in each round. 
Eventually the weights converge leading to higher payoff decisions in successive rounds. 


The simplest approach would be to attach a 0/1 outcome to the loss suffered by an expert at
the end of round $k$. The next leap is to generalize the setting by allowing losses suffered by the
experts to be real numbers in $[0,1]$ instead of binary values. The penalty suffered by the
$i$-th expert in round $k$ is denoted by $m_i^{(k)} \in [0,1]$. The multiplicative weights update method is
therefore a probabilistic experts algorithm with the following steps:


1. Let $w_i$ be the weight assigned to the $i$-th expert. Initialize $w_i = 1$.

2. The prediction will weigh in the opinions of all the experts, with the probability of choosing
a specific expert being proportional to $w_i$. In other words, the probability of choosing
the $i$-th expert will be $w_i / W$, where $W$ is the sum of all weights.

3. Finally, update all weights at the end of the round by setting $w_i \leftarrow w_i (1 - \epsilon)^{m_i^{(k)}}$ for all
experts (assuming $m_i^{(k)} \geq 0$). The generalized case where $m_i^{(k)}$ is allowed to be smaller than
0 is treated in [7].
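
The three steps above can be sketched in a few lines of Python. This is a minimal illustration restricted to penalties in $[0, 1]$; the function names, the default value of `eps` and the use of a seeded random generator are assumptions made for the example and are not taken from the cited references.

```python
import random

def mwum_probabilities(weights):
    """Step 2: probability of following each expert, w_i / W."""
    total = sum(weights)
    return [w / total for w in weights]

def mwum_update(weights, penalties, eps):
    """Step 3: w_i <- w_i * (1 - eps) ** m_i, with penalties m_i in [0, 1]."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if any(m < 0.0 or m > 1.0 for m in penalties):
        raise ValueError("this sketch only covers penalties in [0, 1]")
    return [w * (1.0 - eps) ** m for w, m in zip(weights, penalties)]

def run_mwum(penalty_rounds, eps=0.5, seed=0):
    """Run the method; returns final weights and the expert sampled each round."""
    rng = random.Random(seed)
    n = len(penalty_rounds[0])
    weights = [1.0] * n                      # Step 1: all weights start at 1
    chosen = []
    for penalties in penalty_rounds:
        probs = mwum_probabilities(weights)
        chosen.append(rng.choices(range(n), weights=probs)[0])
        weights = mwum_update(weights, penalties, eps)
    return weights, chosen
```

With two experts, where the first is always penalized by 1 and the second never, three rounds with `eps = 0.5` shrink the first weight to $0.125$ while the second stays at $1$, so the second expert is followed with increasing probability.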


10.1.2. Primal Dual Matrix Multiplicative Weights Update Method 

Notation 61. For the remainder of the section, we use small letters for scalars, small bold
letters for vectors and capital bold letters for matrices.

We shall start with the description of the combinatorial primal-dual matrix MWUM as
outlined in Arora et al. [8]. Accordingly, consider the standard formulation of the primal min-SDP
and its dual max-SDP:


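$$\textbf{minSDP:}\quad \min\ C \bullet X \tag{20}$$
$$\text{s.t.}\quad A_i \bullet X \geq b_i \quad \forall i \in [1 \dots N] \tag{21}$$
$$X \succeq 0 \tag{22}$$

$$\textbf{maxSDP:}\quad \max\ b \cdot y \tag{23}$$
$$\text{s.t.}\quad \sum_{i=1}^{n} y_i A_i \preceq C \tag{24}$$
$$y \geq 0 \tag{25}$$

Here $y$ and $b$ are vectors of the form $y = \{y_1, y_2, \dots, y_n\}$ and $b = \{b_1, b_2, \dots, b_n\}$.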


Algorithm 4 Arora-Kale Primal Dual Matrix MWUM for SDP

Input: (i) The primal min-SDP problem instance. (ii) Candidate objective value $\alpha$. (iii) Accuracy
parameter $\delta$.

Output: $\delta$-feasible dual solution $y$ with dual objective value $\geq (1 - \delta)\alpha$ and $\delta$-feasible primal
solution $X$ with primal objective value $\leq (1 + \delta)\alpha$

 1: Set $X^1 = I$. Let $\epsilon = \delta\alpha / 2\rho n$. Let $\epsilon' = -\ln(1 + \epsilon)$.
 2: for $t = 1$ to $8\rho^2 n^2 \ln(n) / (\delta^2 \alpha^2)$ do
 3:     if $y^t \leftarrow$ Mod-Violation-Checking-Oracle($X^t$) fails then
 4:         return $(y^t, X^t)$
 5:     else
 6:         $M^t = \left(\sum_j A_j y_j^t - C + \rho I\right) / 2\rho$
 7:         $W^{t+1} = (1 + \epsilon)^{\sum_{k=1}^{t} M^k} = \exp\left(-\epsilon' \sum_{k=1}^{t} M^k\right)$
 8:         $X^{t+1} = n W^{t+1} / \mathrm{Tr}(W^{t+1})$
 9:     end if
10: end for
11: return $(y^t, X^t)$


By using binary search, the optimization problem is reduced to a feasibility problem. Let $\alpha$
be the binary search parameter that is passed as an input to the primal dual matrix MWUM
(Algorithm 4). A subroutine called the violation checking Oracle certifies the validity of the
current iterate $X^t$ to declare whether it is primal feasible and has objective value $\geq \alpha$. It should
be noted that the violation-checking Oracle need not point to a single violating constraint but
may return a convex combination of constraints as evidence of violation. The dual vector
$y^t$ that is returned by the Oracle plays the role of choosing a convex combination of violated
constraints. If these conditions are satisfied then the binary search parameter $\alpha$ is updated. If
not, the violation checking Oracle generates a feedback vector $y^t$ which is used to update the
“multiplicative weights” $W^{t+1}$ and consequently generate the next primal iterate $X^{t+1}$. In the
next iteration, $X^{t+1}$ will be used to generate a new dual $y^{t+1}$. This interdependence of primal
and dual iterates makes this algorithm primal-dual.
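
The weight and primal updates can be illustrated with a small Python sketch that computes $W = \exp(-\epsilon' \sum_k M^k)$ and the trace-normalized iterate $X = nW / \mathrm{Tr}(W)$. It is an assumption-laden toy: the matrix exponential is a truncated Taylor series suitable only for small matrices of modest norm, the feedback matrices $M^k$ are supplied by the caller rather than by an oracle, and the function names are hypothetical. Fast approximations of the matrix exponential, as used in [8], are not shown.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Matrix exponential by a truncated Taylor series (small matrices only)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes A^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def next_primal_iterate(feedback_matrices, eps_prime):
    """X = n * W / Tr(W) with W = exp(-eps_prime * sum_k M^k)."""
    n = len(feedback_matrices[0])
    total = [[0.0] * n for _ in range(n)]
    for M in feedback_matrices:
        total = [[t + m for t, m in zip(tr, mr)] for tr, mr in zip(total, M)]
    W = mat_exp([[-eps_prime * x for x in row] for row in total])
    trace = sum(W[i][i] for i in range(n))
    return [[n * x / trace for x in row] for row in W]
```

By construction the returned matrix has trace $n$, matching the normalization $\mathrm{Tr}(X) = n$ assumed in the analysis; with an all-zero feedback matrix the iterate stays at the identity.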

Note 10.1 (The Primal Dual Ideology). The rounding algorithm (in our case the ARV
rounding with (Agarwal et al. + forbidden edge) modifications) is performed directly on the
Cholesky decomposition of the $\delta$-feasible solution. If the rounding algorithm succeeds, then it
produces the approximately integral solution. If the rounding algorithm fails, the failure is
usually dramatic enough to apply rapid corrections to the candidate solution by enforcing feedback
through the dual solution $y$.

The convergence analysis of the algorithm uses $\mathrm{Tr}(W^{t+1})$ as a potential function, which
is also used as a normalizing factor in computing the next primal iterate $X^{t+1}$. The number of
iterations required for determining whether $\alpha$ (as an optimal objective value) is a good guess
or not depends on the so-called 'width parameter' $\rho$. The idea is to ensure that the SDP
formulation is such that we can find the smallest real number $\rho$ which satisfies the condition
$\|\sum_i A_i y_i - C\| \leq \rho$. We shall assume that $\mathrm{Tr}(X) = n$. Also, there are subtleties involved in fast
computations involving the matrix exponential which we won't be discussing. For these and
other details please refer to Arora et al. [8].


10.1.3. The Forbidden Edges Implementation within the Violation-Checking-Oracle


For adapting the min-POP problem to MWUM methods$^{14}$ we need to reformulate some of
the equations of the vector program (min-POPs).$^{15}$ Following the approach delineated in [8],
the triangle inequalities in Equation 10 from the original formulation are replaced with path
inequalities in Equation 29 and the $c$-balanced conditions in Equation 9 are replaced by the
spreading constraints in Equation 28, where $a = 4c(1 - c) - \epsilon$ and $\epsilon' = 1 - \epsilon$. The path inequalities
are implied by the triangle inequalities and the spreading constraints are implied by the
$c$-balanced conditions. Also, the formulation minPOPs introduces a new condition in Equation 30
that wasn't a part of the (min-POPs) formulation. As discussed in [8], these conditions along
with the additional $n+1$ variables therein are included to keep the width bounded.

We are now in a position to formulate the SDP and its dual. The formulation minPOPs is written as
an SDP in minPOPs2 and its dual in maxPOPs. Accordingly, let $v_i \cdot v_j = X_{ij}$, i.e. the $v_i$
are a


14 We have tried to cover as much background as possible while keeping the length of the exposition in mind.
That said, for a reader who wishes for a more in-depth understanding of the material at hand, a reading of the reference
source [1] is highly recommended. Appendix A.2 on pg.19, which constructs the Violation-Checking-Oracle, is
even more directly relevant. To keep things simple for the reader, we shall use the same notation as they do
wherever possible.

15 The SDP works as a relaxation of both the min-POP as well as the min-DBCRE. However, any rounding of
the SDP will approximate only the min-DBCRE.
























set of vectors obtained from the Cholesky decomposition of $X$. Let $c_{ij}$ be the coefficient of
$|v_i - v_j|^2$ in the objective of the min-POP formulation above. We can now rewrite the objective
as $\min C \bullet X$. Also, let $D_{ij}$ be the matrix representative for the directed metric $\mathcal{D}(v_i, v_j)$. The
SDP and its dual formulation are described in minPOPs2 and maxPOPs.


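$$\textbf{minPOPs:}\quad \min \sum_{(v_i, v_j) \in \mathcal{E}} \left( |v_i - v_j|^2 - |v_0 - v_i|^2 + |v_0 - v_j|^2 \right) \tag{26}$$
$$\text{s.t.}\quad \sum_{(u_i, v_i) \in \mathcal{F}} \left( |u_i - v_i|^2 - |v_0 - u_i|^2 + |v_0 - v_i|^2 \right) = 0 \tag{27}$$
$$\sum_{i,j \in S} |v_i - v_j|^2 \geq a n^2 \quad \forall S \text{ s.t. } |S| \geq \epsilon' n \tag{28}$$
$$\sum_{j=1}^{k-1} |v_{i_j} - v_{i_{j+1}}|^2 \geq |v_{i_1} - v_{i_k}|^2 \quad \forall \text{ paths } p \tag{29}$$
$$|v_{n+i} - v_{n+j}|^2 = 0 \quad \forall i, j \in [1 \dots (n+1)] \tag{30}$$
$$|v_i|^2 = 1 \quad \forall i \in [1 \dots 2(n+1)] \tag{31}$$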


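$$\textbf{minPOPs2:}\quad \min\ C \bullet X \tag{32}$$
$$\text{s.t.}\quad \sum_{k=1}^{\kappa} D_{i_k j_k} \bullet X = 0 \tag{33}$$
$$\sum_{i,j \in S} K_S \bullet X \geq a n^2 \quad \forall S \text{ s.t. } |S| \geq \epsilon' n \tag{34}$$
$$T_p \bullet X \geq 0 \quad \forall \text{ paths } p \tag{35}$$
$$E_{ij} \bullet X = 0 \quad \forall i, j \in [1 \dots (n+1)] \tag{36}$$
$$X_{ii} = 1 \quad \forall i \in [1 \dots 2(n+1)] \tag{37}$$
$$X \succeq 0 \tag{38}$$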


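$$\textbf{maxPOPs:}\quad \max \sum_i x_i + a n^2 \sum_S z_S \tag{39}$$
$$\text{s.t.}\quad \mathrm{diag}(x) + \sum_{i,j \in [1 \dots n]} q_{ij} E_{ij} + \sum_p f_p T_p + \sum_S z_S K_S + \sum_{k=1}^{\kappa} \nu_k D_{i_k j_k} \preceq C \tag{40}$$
$$z_S \geq 0 \quad \forall S \text{ s.t. } |S| \geq \epsilon' n \tag{41}$$
$$\nu_k \geq 0 \quad \forall k \in [1 \dots \kappa] \tag{42}$$
$$f_p \geq 0 \quad \forall \text{ paths } p \tag{43}$$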









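Lemma 62. If the triangle inequalities (alternatively, path inequalities) are satisfied, then we
have $\sum_{j=1}^{k-1} \mathcal{D}(v_{i_j}, v_{i_{j+1}}) \geq \mathcal{D}(v_{i_1}, v_{i_k})$.

Proof. Consider the term $\sum_{j=1}^{k-1} \mathcal{D}(v_{i_j}, v_{i_{j+1}})$. Using the definition of the directed semi-metric along
with a telescoping sum we get,

$$\sum_{j=1}^{k-1} \mathcal{D}(v_{i_j}, v_{i_{j+1}}) = \sum_{j=1}^{k-1} \left( |v_0 - v_{i_{j+1}}|^2 + |v_{i_j} - v_{i_{j+1}}|^2 - |v_0 - v_{i_j}|^2 \right) \tag{44}$$
$$= \sum_{j=1}^{k-1} \left( |v_{i_j} - v_{i_{j+1}}|^2 \right) - |v_0 - v_{i_1}|^2 + |v_0 - v_{i_k}|^2 \tag{45}$$
$$\geq |v_{i_1} - v_{i_k}|^2 - |v_0 - v_{i_1}|^2 + |v_0 - v_{i_k}|^2 \tag{46}$$
$$= \mathcal{D}(v_{i_1}, v_{i_k}) \tag{47}$$

□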
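
The telescoping step from Equation 44 to Equation 45 holds for arbitrary vectors, while the inequality in Equation 46 is exactly where the path inequality is used. The Python sketch below makes both points concrete; the function names are illustrative assumptions.

```python
def sq_dist(x, y):
    """Squared Euclidean distance |x - y|^2."""
    return sum((p - q) ** 2 for p, q in zip(x, y))

def directed_dist(v0, x, y):
    """D(x, y) = |v0 - y|^2 + |x - y|^2 - |v0 - x|^2."""
    return sq_dist(v0, y) + sq_dist(x, y) - sq_dist(v0, x)

def path_sum(v0, path):
    """Left-hand side of Lemma 62: sum of D over consecutive path vertices."""
    return sum(directed_dist(v0, path[j], path[j + 1])
               for j in range(len(path) - 1))

def telescoped_form(v0, path):
    """Right-hand side of Equation 45 after telescoping the v0 terms."""
    hops = sum(sq_dist(path[j], path[j + 1]) for j in range(len(path) - 1))
    return hops - sq_dist(v0, path[0]) + sq_dist(v0, path[-1])

def slack(v0, path):
    """path_sum - D(first, last); nonnegative iff the path inequality holds."""
    return path_sum(v0, path) - directed_dist(v0, path[0], path[-1])
```

For three collinear points $0, 1, 2$ on a line the squared distances violate the path inequality ($1 + 1 < 4$), and `slack` is negative; for a right-angled path such as $(0,0), (1,0), (1,1)$ the path inequality is tight and `slack` is zero.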


Note that the objective of the vector program DBCRE-PATH(VP) is written as an inequality
constraint as follows:

$$\sum_{(v_i, v_j) \in \mathcal{E}} \mathcal{D}(v_i, v_j) \leq \alpha \tag{48}$$

So, instead of an optimization problem we have a feasibility problem. Each iteration is one step of a
binary search on the parameter $\alpha$. Every binary search iteration involves solving a feasibility problem
with constraints specified in Equation 48, Equation 27, Equation 28, Equation 29, Equation 30
and Equation 31.
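
The reduction from optimization to feasibility is a standard binary search over $\alpha$. A minimal Python sketch follows, assuming a monotone feasibility check `is_feasible` (a hypothetical stand-in for one run of the primal-dual algorithm with candidate value $\alpha$) and a known bracketing interval.

```python
def binary_search_alpha(is_feasible, lo, hi, tol=1e-6):
    """Smallest alpha in [lo, hi] (up to tol) for which is_feasible(alpha) holds.

    is_feasible is a stand-in for one run of the feasibility check; it is
    assumed to be monotone: feasible for every alpha above some threshold.
    """
    if not is_feasible(hi):
        raise ValueError("no feasible alpha in the given range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_feasible(mid):
            hi = mid      # a solution of value <= mid exists; try smaller
        else:
            lo = mid      # infeasible; the optimum is larger than mid
    return hi
```

The number of feasibility checks is logarithmic in `(hi - lo) / tol`, which is why the binary search adds only a logarithmic factor to the running time of a single feasibility run.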

Lemma 63. If we can find a multicommodity flow such that:


1. (Degree constraint) For any node, the total flow on all paths starting from that node is at
most $d = O(\gamma / n)$,

2. (Capacity constraints) For any edge, the total flow on all paths using the edge is at most 1,

3. (Total flow constraint) Let $f_{ij}$ denote the total flow on all paths from $i$ to $j$; then
$\sum_{ij} f_{ij} \mathcal{D}(v_i, v_j) > \gamma$,


then either the objective constraint (specified in Equation 48) is violated or the set of triangle
inequality constraints (alternatively, the set of path constraints in Lemma 62) is violated.


Proof. We deduce Equation 49 from Lemma 62. Equation 50 is merely a change of variables.
Finally, Equation 51 follows from Condition 3 of the statement of Lemma 63 (i.e. total flow
constraint).

$$\sum_{p(i,j)} \sum_{(v_k, v_l) \in p(i,j)} f_{p(i,j)} \mathcal{D}(v_k, v_l) \geq \sum_{p(i,j)} f_{p(i,j)} \mathcal{D}(v_i, v_j) \tag{49}$$
$$= \sum_{ij} f_{ij} \mathcal{D}(v_i, v_j) \tag{50}$$
$$> \gamma \tag{51}$$






























Equation 52 is merely a rearrangement of summations. Equation 53 is obtained from Condition
2 of the statement of Lemma 63 (i.e. capacity constraints). Finally, Equation 54 follows from
assuming that the objective constraint in Equation 48 holds.

$$\sum_{p(i,j)} \sum_{(v_k, v_l) \in p(i,j)} f_{p(i,j)} \mathcal{D}(v_k, v_l) = \sum_{(v_i, v_j) \in \mathcal{E}} \sum_{p \ni (v_i, v_j)} f_p \mathcal{D}(v_i, v_j) \tag{52}$$
$$\leq \sum_{(v_i, v_j) \in \mathcal{E}} \mathcal{D}(v_i, v_j) \tag{53}$$
$$\leq \gamma \tag{54}$$

□


Lemma 64. Let $S \subseteq V$ be a set of nodes of size $\Omega(n)$. Suppose we are given, for all $i \in S$,
vectors $v_i$, $w_i$ of length $O(1)$, s.t. $\forall i, j$, $\|w_i - w_j\|^2 \leq o(1)$ and $\sum_{i,j \in S} \|v_i - v_j\|^2 \geq \Omega(n^2)$. For
any given $\alpha$,


1. There is an algorithm which, using a single max-flow computation, either outputs a
$c'$-balanced cut of expansion $O(\log(n)\, \alpha / n)$ or a valid $O(\log(n)\, \alpha / n)$-regular directed flow $f_{ij}$ s.t.
$\sum_{ij} f_{ij} \mathcal{D}(i, j) > \alpha$.

2. There is an algorithm which, using $O(\log n)$ max-flow computations, outputs either:

a) $\Omega(n / \sqrt{\log n})$ vertex disjoint paths s.t. the path inequality along these paths is violated by
$\Omega(1)$, or

b) a $c'$-balanced cut of expansion $O(\sqrt{\log n}\, \alpha / n)$, or

c) a valid $O(\alpha / n)$-regular directed flow $f_p$ s.t. $\sum_{ij} f_{ij} \mathcal{D}(i, j) > \alpha$.


Proof. Please see Lemma 3, Section 4.3 of [8]. The proof is provided in Section A.2 pg.21
of [8]. □


We need to find a procedure to verify the following forbidden edge constraints:

$$D_{i_k j_k} \bullet X \leq \sigma \quad \forall k \in [1, \kappa] \tag{55}$$

Instead we check for the single constraint Equation 56, which if satisfied implies all constraints
in Equation 55:

$$\sum_{k=1}^{\kappa} D_{i_k j_k} \bullet X \leq \sigma \tag{56}$$

We need to do this in a manner that the width of the formulation stays bounded. 

In particular, our Oracle needs to find a feedback matrix Q that satisfies the following two 
equations: 

















Here $\mathrm{diag}(x)$ is a diagonal matrix with vector $x$ on its diagonal. Equation 57 needs to be evidence
of violation of a convex combination of forbidden edge constraints in Equation 55. Equation 58
is the width constraint.

Our VC-Oracle can be described by the following steps:

1. Assume that the Oracle has already verified the constraints specified in Equation 34,
Equation 36 and Equation 37. If any of these constraints are violated then a feedback
matrix is generated in a manner identical to the Oracle for min-directed $c$-balanced cut as
specified in Appendix A.2 of [8].

2. Owing to step 1, the conditions of Lemma 10, Appendix B of [8] are therefore satisfied
by vectors $v_i$. We pick a random unit vector $u$. Analogous to the ARV-Algorithm, we
let $\mathcal{L} = \{i : v_i \cdot u < -\sigma\}$ and let $\mathcal{R} = \{i : v_i \cdot u > +\sigma\}$, for a suitable threshold $\sigma$. Therefore, the
sets $\mathcal{L}$ and $\mathcal{R}$ obtained are of size $\Omega(n)$ with constant probability. Subsequently,
connect all nodes in $\mathcal{L}$ to a single source with edges of capacity equal to the degree $d$, and all nodes in
$\mathcal{R}$ to a single sink, once again with edges of capacity $d$. Run a single-commodity max-flow
algorithm along with the dummy source and sink. To begin with, we attempt to find an
$O(\log(n)\, \alpha / n)$-regular directed flow $f_{ij}$ s.t. $\sum_{i,j} f_{ij} \mathcal{D}(i, j) > \alpha$ (which is shown to be equivalent
to the condition of the max-flow value exceeding $O(\alpha \log n)$). Ignoring the flow on source and
sink edges, the max-flow gives the required multicommodity flow $f_{ij}$ prescribed
by Lemma 64.

a) If we are successful in finding such a flow, then using Lemma 63 (substituting $\gamma = \alpha$),
we infer that either the objective constraint specified in Equation 48 or the set of
path constraints specified in Lemma 62 is violated. Moreover, because of Condition
1 of Lemma 63 (i.e. degree constraints), the width of the formulation is bounded.
Assuming that the condition in Equation 34 is satisfied, we choose any set $S$ s.t.
$\sum_{i,j \in S} K_S \bullet X \geq a n^2$ and $|S| \geq \epsilon' n$. Now, a Laplacian $D$ of the complete
weighted graph is constructed s.t. for every edge $e_{ij}$ s.t. $i \in S$ and $j \in T$, we associate
weight $f_{ij}$. For all other edges we assign weight 0. It is shown in Thm. 5 of [8] that
$\mathrm{diag}(x) - D$ is the requisite feedback matrix. Because the flow is $d$-regular, the
width stays bounded.

b) It is proved in [8] that if this max-flow has value less than $O(\alpha \log n)$, then the min-cut
found in the process is an $O(\log n)$ approximation to the min directed $c$-balanced
separator.

3. If we find the requisite max-flow in step 2, then we obtain a feedback matrix and we
are done. Else we have found a min-cut from step 2, which happens to be an $O(\log n)$
approximation to the min-directed $c$-balanced separator. Let us denote this cut as $C_1$. Now, we
apply the same procedure as step 2, except that now we try to find a flow that satisfies the
path inequality constraints from Lemma 62 along with Equation 56. Here we intend to
apply Lemma 63 with Equation 56 as a replacement for the objective constraint Equation 48.

a) Now, owing to the outcome of step 2, we already know that the path inequalities are
satisfied. So, if we are able to find the requisite flow, it must surely be an indicator of
the fact that the forbidden-edge constraints are violated. Here we are using Lemma 63
by substituting $\gamma = \sigma$. As in the case of 2.(a), we construct a new Laplacian, say $D_2$,
of the complete weighted graph. Note that $D_2 \bullet X = \sum_{ij} f_{ij} \mathcal{D}(v_i, v_j) > \sigma$. (However, in
this case the degree of the flow will be $O(\sigma / n)$.)























b) If we are unable to find such a flow, then we output cut $C_1$ as our required cut, which
approximates the min-directed balanced cut (with rigid edges) up to a factor of $O(\log n)$.
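
The projection step at the start of step 2 can be sketched in Python as follows. The sketch only builds the two candidate sets from a random direction and a threshold; the flow network, the max-flow computation and the feedback matrix are not shown. The function names and the dictionary representation of the vectors are assumptions made for illustration.

```python
import math
import random

def random_unit_vector(dim, rng):
    """Uniform random direction, obtained by normalizing a Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in g))
    return [x / norm for x in g]

def projection_sets(vectors, u, sigma):
    """Split vertex ids by their projection on u, leaving a gap of width 2*sigma.

    vectors maps a vertex id to its vector. Vertices whose projection falls
    inside [-sigma, sigma] are left unassigned; they play the role of the
    points inside the fat hyperplane.
    """
    left, right = set(), set()
    for i, v in vectors.items():
        proj = sum(p * q for p, q in zip(v, u))
        if proj < -sigma:
            left.add(i)
        elif proj > sigma:
            right.add(i)
    return left, right
```

In the oracle, the nodes of the first set would then be attached to a dummy source and those of the second set to a dummy sink, each with capacity $d$, before running max-flow.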
Lemma 65 (VC-Oracle). The Violation Checking Oracle from [8] can be extended to include a
check on the forbidden edges condition violation. This check can be done in linear time in a
manner that respects the width condition of the primal-dual formulation.


Proof. The handling of Equation 34, Equation 37 and Equation 36 in step 1 and the
handling of Equation 48 and Equation 35 in step 2 is identical to the handling of the corresponding equations
in the directed balanced cut formulation of Section A.2 in [8]. So, we need to prove correctness
of step 3.

Suppose that we are able to find the requisite max-flow in step 3, s.t.

1. For any node, the total flow on all paths starting from that node is at most $d = O(\sigma / n)$,

2. For any edge, the total flow on all paths using the edge is at most 1,

3. Let $f_{ij}$ denote the total flow on all paths from $i$ to $j$; then $\sum_{ij} f_{ij} \mathcal{D}(v_i, v_j) > \sigma$,

then the three conditions of Lemma 63 are satisfied and we have evidence that either the path
inequalities of Lemma 62 are violated or the constraint specified in Equation 56 is violated. But,
from step 2, we already know that the path inequalities are satisfied. So, it must be that Equation 56
is violated. Let $\mathrm{diag}(x)$ be the diagonal matrix with the vector $x$ on the diagonal. In order to satisfy
Equation 57 we choose $x_i = \sigma / n$ and $Q = D_2$. Thus, we obtain $(\mathrm{diag}(x) - D_2) \bullet X < \sigma - \sigma = 0$
since $D_2 \bullet X = \sum_{ij} f_{ij} \mathcal{D}(v_i, v_j) > \sigma$. Hence, $\mathrm{diag}(x) - D_2$ is our feedback matrix. Also, from Thm.
5 of [8], we know that for a $d$-regular flow, we have $0 \preceq D_2 \preceq 2dI$. In our case, $d = \sigma / n$. Also,
$\mathrm{diag}(x) = (\sigma / n) I$. Therefore, clearly, $\|\mathrm{diag}(x) - Q\| \leq O(\sigma / n)$. Now, $\sigma$ can be chosen to be a small
number; for our purposes, it suffices to let $\sigma = \Theta(1 / \log n)$. Also, clearly, our minimum for
MMUP must always be at least 1, irrespective of the complex under consideration. So we always
have $\alpha \geq 1$. In other words, $\alpha > \sigma$. Therefore, $\|\mathrm{diag}(x) - Q\| \leq O(\alpha / n) = \rho$, satisfying Equation 58.


Now, if we are unable to find the max-flow that guarantees a forbidden-edge violation, then clearly
the current configuration of vectors is such that almost all the forbidden edge constraints are
nearly satisfied. We say nearly because ideally we would have preferred $D_{i_k j_k} \bullet X = 0$, $\forall k \in [1, \kappa]$.
$\sigma$ is made so small that we can be sure that, for all practical purposes, for any forbidden edge
$\mathcal{D}(v_i, v_j)$, the segments $v_j - v_0$ and $v_j - v_i$ are almost orthogonal. So for any cut, nearly all
forbidden edge constraints will be obeyed. Using the first part of Lemma 64 we get an approximation
of $O(\log n)$ for the min directed $c$-balanced cut with (nearly satisfied) forbidden edge constraints.
Now suppose it so happens that this cut violates some of the forbidden edge constraints. For
each of the forbidden edges, we have $\mathcal{D}(v_i, v_j) \leq \sigma$. Since $\sigma = O(1 / \log n)$, we can ensure that
if a pair violates forbidden edges in spite of obeying the relaxed forbidden edge constraint in
Equation 55, then at least one of the two points $v_i$, $v_j$ must lie inside the fat hyperplane. Note
that the approximation of $O(\log n)$ for the min-directed $c$-balanced cut is achieved through points
that lie outside the fat hyperplane, and redistribution of points that lie inside the fat hyperplane
to either of the two sets $\mathcal{L}$ or $\mathcal{R}$ does not affect the approximation ratio. Therefore, the points in
the fat hyperplane are redistributed in a manner analogous to the Agarwal et al. rounding. So,
we can use cut $C_1$ from step 2 and we are done. □


The authors of [8] point out a striking analogy when they say that their matrix multiplicative weights update method is the SDP analog of Young’s “Randomized rounding without solving the LP” [92]. Now, apart from the orthogonality constraint, all other conditions for the violation checking Oracle are treated in exactly the same manner as directed balanced separator explained in Section 4.3, pg. 14 of [8]. The treatment of triangle inequalities (that are rewritten as path inequalities in the above formulation) deserves a special mention. In an effort to refute these inequalities one comes face-to-face with the “round-without-solve” character of MWUM. For this reason, we briefly discuss how single commodity flow (alternatively multicommodity flow) may be used to

1. find a combination of constraints (triangle inequalities and/or objective-function constraint) that are violated

2. directly obtain an approximate solution of the SDP!

10.1.4. From Triangle Inequalities to Multicommodity Flow 

Note that the objective function of the SDP is modeled as a constraint:

$$C \bullet X \le \alpha \qquad (59)$$

where $\alpha$ is a binary search parameter, thereby reducing the optimization problem to a sequence of feasibility problems driven by binary search. For every iteration, the violation checking Oracle must check if Equation 59 is violated. With a certain ingenuity, [8] reduce the problem of checking the first constraint and the triangle inequality constraints to a multicommodity flow problem which is specified in the following manner (pg. 8 of [8]).

1. For any node, the total flow on all paths starting from the node is at most $d = O(\alpha/n)$ (degree constraints)

2. For any edge $e_{ij}$, the total flow on all paths using the edge is at most $c_{ij}$ (capacity constraints) (where $\sum_{ij \in E} c_{ij} \|v_i - v_j\|^2$ is the objective function)

3. If $f_{ij}$ is the total flow on all paths from $i$ to $j$, then $\sum_{ij} f_{ij} \|v_i - v_j\|^2 > \alpha$
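The three conditions above can be checked mechanically once a candidate flow is given as a path decomposition. The following is a minimal sketch for illustration only, not the oracle of [8]: the function name `check_flow_conditions`, the path-list representation, and the use of directed edge keys are assumptions made here.

```python
from collections import defaultdict

def check_flow_conditions(paths, vectors, capacity, d, alpha):
    """Check degree, capacity and violation conditions for a path-decomposed flow.

    paths    : list of (node_list, amount) pairs (hypothetical representation)
    vectors  : dict node -> tuple of coordinates (the SDP vectors v_i)
    capacity : dict (u, v) -> c_uv for each directed edge
    d, alpha : degree bound and binary search parameter
    """
    out_flow = defaultdict(float)   # total flow on paths starting at a node
    edge_flow = defaultdict(float)  # total flow on paths using an edge
    pair_flow = defaultdict(float)  # f_ij: total flow routed from i to j
    for nodes, amount in paths:
        out_flow[nodes[0]] += amount
        pair_flow[(nodes[0], nodes[-1])] += amount
        for u, v in zip(nodes, nodes[1:]):
            edge_flow[(u, v)] += amount

    def sq_dist(i, j):
        # squared Euclidean distance ||v_i - v_j||^2
        return sum((a - b) ** 2 for a, b in zip(vectors[i], vectors[j]))

    degree_ok = all(f <= d for f in out_flow.values())
    capacity_ok = all(f <= capacity.get(e, 0.0) for e, f in edge_flow.items())
    weighted = sum(f * sq_dist(i, j) for (i, j), f in pair_flow.items())
    return degree_ok, capacity_ok, weighted > alpha
```

A flow for which all three entries are true is the kind of certificate the oracle looks for; if any entry is false, the flow does not witness a violation.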

Lemma 1 on pg. 8 of [8] proves that there exists an algorithm which either outputs a multicommodity flow as prescribed above or outputs a $c$-balanced cut of expansion $O(\sqrt{\log n} \cdot \alpha/n)$. If we find the prescribed multicommodity flow, then we have found a violation. Else we have found the desired cut of certain expansion that eventually helps us arrive at an $O(\sqrt{\log n})$ ratio. The multicommodity flow problem has certain subtleties concerning the degree regularity condition. Once these subtleties are taken care of, one may solve the multicommodity flow using algorithms devised in [29, 38]. We are however interested in the more efficient max-flow solution described in subsubsection 10.1.5.


10.1.5. Max-flows in MWUM 

The description of the max-flow subroutine is provided in the proof of Lemma 1 on pg. 12 of [8]. To begin with, one assumes that all other conditions except for the condition specified in Equation 59 and the triangle inequalities are satisfied in this iteration. Consider a random vector $u$. It is proved in Lemma 10, Appendix B of [8] that it is possible to find two sets $L$ and $R$ of size $\Omega(n)$ s.t.:

$$\forall i \in L,\ j \in R: \quad (v_i - v_j) \cdot u \ge \frac{a}{\sqrt{n}} \qquad (60)$$

for some constant $a$. The two sets $L$ and $R$ are extracted using projections on the random vector $u$. Here the procedure is analogous to the one found in Algorithm 2 in section 9. The proofs of correctness are similar to those employed in Lemma 10 of [8].


The second part of Lemma 1 on pg. 8 of [8] concerns the use of max-flows in MWUM computations. If the max-flow has value less than $O(\log n \cdot \alpha)$, then the min-cut found in the process is an $O(\log n)$ approximation to the minimum $c$-balanced separator. Otherwise the max-flow gives the required multicommodity flow that establishes violation of a combination of
constraints. Therefore we obtain an O(logn) approximation to 


10.1.6. Discussion on Time Complexity 


We have already observed in Theorem 44 that the reduction procedure (from MMUP to min-POP) takes linear time. Now we need to establish the nearly linear time complexity of the approximation algorithm for min-POP. To begin with, recall that we use Leighton-Rao style divide-&-conquer by recursively solving min-directed balanced cut with rigid edges (min-DBCRE) in order to solve min-POP. We have modified the matrix MWUM from [8] to solve the directed balanced cut (min-DBC) problem (with rigid edges) in section 10. In [8], pg. 3, Table 1, the run times for directed balanced separator are listed. Also, Theorem 8 on pg. 14, Section 4.3 along with Appendix A.2 of [8] gives us details of how one may use a polylogarithmic number of single commodity max-flow computations to obtain an $O(\log n)$ pseudo-approximation. Moreover, the computational bottleneck of this MWUM algorithm for min-DBC is the runtime of the max-flow subroutine. At the time when [8] was written, the (asymptotically) best algorithm for max-flow ran in $\tilde{O}(m^{1.5})$ time, which allowed them to obtain an $O(\log n)$ approximation in $\tilde{O}(m^{1.5})$ time. But since then, $\tilde{O}(m)$ time single commodity max-flow algorithms have been independently developed in [53] and [82]. If we use these new state-of-the-art max-flow algorithms as subroutines in the violation checking oracles of MWUM, we obtain an $\tilde{O}(m)$ time algorithm for min-DBC and hence also for min-DBCRE, given the fact that we do not add any extra superlinear computational cost to handle rigid edges in min-DBCRE. Now, since we use Leighton-Rao style divide-&-conquer, we can use an elementary computation to see that in spite of the recursive application of min-DBCRE (on smaller subproblems), the asymptotic computational cost of min-POP remains bounded by $\tilde{O}(m)$.
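The elementary computation mentioned above can be sketched as follows. The constants $c_1$, $k$ and $c_0$ are assumptions introduced here for illustration: a single min-DBCRE call on a subproblem with $m'$ edges is taken to cost at most $c_1 m' \log^k m'$, and each $c_0$-balanced cut is taken to leave at most a $(1 - c_0)$ fraction of the vertices on either side.

```latex
\begin{align*}
T(m) &\le T(m_1) + T(m_2) + c_1\, m \log^k m, \qquad m_1 + m_2 \le m, \\
\text{recursion depth} &\le \log_{1/(1-c_0)} n = O(\log n), \\
\text{cost per level} &\le \sum_{j} c_1\, m_j \log^k m_j \le c_1\, m \log^k m
  \quad \text{(subproblems on one level are edge-disjoint)}, \\
T(m) &\le O(\log n) \cdot c_1\, m \log^k m = \tilde{O}(m).
\end{align*}
```

Under these assumptions the recursion contributes only one extra logarithmic factor, which is absorbed by the $\tilde{O}$ notation.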


11. Applications 

11.1. Homology 

Computing the homology groups has several applications, particularly in material sciences, imaging, pattern classification and CAPD (computer assisted proofs in dynamics). Once effective implementations for homology computation like CHomP (by Mischaikow’s group) and RedHom (by Mrozek’s group) became publicly available, problems in dynamics were routinely translated to problems about computing homology on cubical sets. More recently, homology has come to be regarded as a more widely applicable computational invariant of topological spaces arising from practical data sets of interest to a rapidly growing community of computer scientists and applied mathematicians [15]. Ordinary homology groups are algebraic objects that encode geometric features (n-dimensional holes). Given a point cloud sample, the goal of persistent homology (as opposed to ordinary homology) is the study of multiscale features of that space. The notion of filtration allows us to study the multiscale representation of that space.

Figure 14: Simplicial Complex $\mathcal{K}$ and its boundary operator

Let $\mathcal{A} = \sum_{i=0}^{2} m_i$ and let $T = \sum_{i=0}^{2} \beta_i$.

Theorem 66 (Complexity of computing boundary operator). The boundary operator can be computed in $\tilde{O}(\mathcal{A} \times \mathcal{N})$ time.

We refer the reader to subsection C.2 for the algorithm. Theorem 82 establishes the correctness of this algorithm.

Morse theory allows cancellation of pairs of critical cells using gradient reversals. One then computes a new boundary operator upon cancellation. The criterion for cancellation of critical cells is easily identifiable from the boundary operator. In principle (to get a further reduction in the number of critical cells), such a cancellation routine precedes the homology subroutine and succeeds the boundary operator design subroutine. To keep the discussion simple we do not discuss critical cell cancellation here. A discussion of the newly non-smooth cancellations can be found in subsection C.1.


It is easy to see that under the weak Morse optimality condition (WMOC) discussed in subsection 2.4, we can compute homology groups in nearly linear time by using the MMUP algorithm to reduce the complex size. Combinatorial Hodge theory has been used in the past to compute homology (instead of the standard Smith normal form) [37]. We show that using Hodge theory [78], we can obtain drastic reductions in the runtime of the algorithm (under the WMOC) [78].
















Algorithm 5 MMUP-based Homology Computation

1: ▷ Let $\mathcal{K}$ denote the input simplicial complex and $\mathcal{H}$ denote its Hasse graph representation.
2: $\mathcal{H}$ := HasseDiagramRepresentation($\mathcal{K}$)
3: $\mathcal{V}$ := MMUP-DGVF($\mathcal{H}$)
4: $\mathcal{T}$ := TopologicalSort($\mathcal{V}$)
5: $\Delta$ := DesignBoundaryOperator($\mathcal{T}$, $\mathcal{V}$)
6: $\Delta$ := CriticalCellCancellation($\Delta$)
7: $H(\mathcal{K}, A)$ := SmithNormalForm($\Delta$, $A$)

Note 11.1. Any total order (compatible with the partial order specified by the DGVF) on 
cells gives an implicit discrete Morse function. To get such a function, one simply assigns the 
index of the cell in the sorted list as the value of the Morse function. This explains the use of 
topological sort in the above Algorithm. 
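The construction in Note 11.1 can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation behind Algorithm 5: the function name `morse_function_from_dgvf` and the dictionary encoding of the Hasse graph and of the vector field are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) is used for the topological sort.

```python
from graphlib import TopologicalSorter

def morse_function_from_dgvf(faces, matching):
    """Assign each cell its index in a total order compatible with a DGVF.

    faces    : dict cell -> list of its codimension-1 faces (empty for vertices)
    matching : dict face -> coface, the gradient pairs of the vector field
    Raises graphlib.CycleError if the matching is not acyclic.
    """
    preds = {cell: set() for cell in faces}
    for coface, cell_faces in faces.items():
        for face in cell_faces:
            if matching.get(face) == coface:
                # gradient pair: the coface must come first, so f(coface) < f(face)
                preds[face].add(coface)
            else:
                # ordinary incidence: the face must come first, so f(face) < f(coface)
                preds[coface].add(face)
    order = TopologicalSorter(preds).static_order()
    return {cell: index for index, cell in enumerate(order)}
```

For the boundary of a triangle with vertices `a`, `b`, `c` and the pairs `(a, ab)` and `(b, bc)`, the cells `c` and `ca` stay unmatched and are the critical cells of the resulting function.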


11.2. Persistent Homology 


We will describe how the most naive application of the MMUP-APX algorithm to compute persistent homology leads to substantial gains in time complexity. A filtration on a simplicial complex $\mathcal{K}$ is a collection of subcomplexes $\{\mathcal{K}(t) \mid t \in \mathbb{R}\}$ of $\mathcal{K}$ s.t. $\mathcal{K}(t_i) \subseteq \mathcal{K}(t_j)$ when $t_i < t_j$. Typically, in persistent homology, one studies the topological invariants of the nested family of complexes described by such a filtration. Alternatively one studies sublevel sets of an explicitly provided scalar function, i.e., one studies topological features of a space induced either by a function or a filtration. Its algebraic framework lends it generality whereas its geometric proximity to Morse theory gives it stability with regard to noise. Persistence gives us a natural grading of features at multiple resolutions and sieves out noise from features. This has led to several applications in shape analysis, image analysis, and data analysis [15, 22, 23]. As we can see from Figure 15 and Figure 16, the very notion of ’holes’ is fuzzy for a point cloud dataset, in the sense that it is dependent on the scale at which we construct the complex. An increase in scale leads to gradual thickening of the space. Clearly, some holes last longer (i.e. they are more persistent) than others. The underlying intuition behind using persistent homology for a variety of applications is that noise is likely to have low persistence whereas features that are truly reflective of the underlying topology will be more persistent.

Definition 67 (Nerve of a Complex). Suppose that we are given a finite collection of nonempty convex sets $S = \{U_j, j \in J\}$. Then the nerve of $S$ is given by

$$\operatorname{Nrv}(S) = \{\sigma \subseteq J \mid \bigcap_{j \in \sigma} U_j \neq \emptyset\}.$$

Definition 68 (Cech complex). Suppose that we are given $(\mathbb{R}^n, d)$, a metric space, $S$ a finite set of points in $\mathbb{R}^n$, and $\kappa > 0$. Then the Cech complex for $S$ and $\kappa$ is isomorphic to the nerve of the collection of $n$-balls of radius $\kappa$ with centers in $S$, that is, to

$$\{\sigma \subseteq S \mid \bigcap_{v_j \in \sigma} B_{v_j}(\kappa) \neq \emptyset\}.$$

Definition 69 (Vietoris-Rips complex). Suppose that we are given $(Q, d)$, a metric space, $S$ a finite set of points in $Q$, and $\kappa > 0$. The Vietoris-Rips complex of parameter $\kappa$ of $S$, denoted by $\mathcal{R}_\kappa(S)$, is the abstract simplicial complex whose $n$-simplices correspond to unordered $(n + 1)$-tuples of vertices in $S$ that are within pairwise distance less than $2\kappa$ of each other.
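Definition 69 translates directly into a brute-force construction. The sketch below is only meant to make the definition concrete: the function name `vietoris_rips` and the cap `max_dim` are assumptions introduced here, points are taken to live in Euclidean space, and the enumeration is exponential in `max_dim`, so it is unsuitable for large inputs.

```python
from itertools import combinations
from math import dist

def vietoris_rips(points, kappa, max_dim=2):
    """Return the simplices (as index tuples) of the Vietoris-Rips complex.

    A set of k + 1 points spans a k-simplex when all pairwise Euclidean
    distances are strictly less than 2 * kappa.
    """
    n = len(points)

    def close(i, j):
        return dist(points[i], points[j]) < 2 * kappa

    simplices = [(i,) for i in range(n)]          # every point is a vertex
    for size in range(2, max_dim + 2):            # size = dimension + 1
        for verts in combinations(range(n), size):
            if all(close(i, j) for i, j in combinations(verts, 2)):
                simplices.append(verts)
    return simplices
```

Increasing `kappa` can only add simplices, which is the monotonicity that makes the family of complexes a filtration.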














Figure 15: Cech Complex with Increasing Radii (parameter)



Consider a simplicial complex $\mathcal{K}$ of size $n$. A filtration of a simplicial complex is an ordering relation on its simplices which respects inclusion. Consider a function $\Theta : S \to \mathbb{R}$. Here, $S$ denotes the set of non-empty subsets of $\mathcal{K}$. The filtration parameter $\Theta$ is monotonic in the sense that, for any two simplices $\sigma \subseteq \gamma$ in $\mathcal{K}$, $\Theta$ satisfies $\Theta(\sigma) \le \Theta(\gamma)$ (analogous to discrete Morse theory). We will call $\Theta(\sigma)$ the filtration value of the simplex $\sigma$. Topologically speaking, the monotonicity condition is a semantic necessity as it is required to ensure that the sublevel sets $\mathcal{K}(q) = \Theta^{-1}(-\infty, q]$ are subcomplexes of $\mathcal{K}$, for every $q \in \mathbb{R}$. Consequently, we arrive at the following sequence of $n + 1$ subcomplexes:

$$\emptyset = \mathcal{K}_0 \subseteq \mathcal{K}_1 \subseteq \cdots \subseteq \mathcal{K}_n = \mathcal{K} \quad \text{s.t.} \quad \mathcal{K}_i = \Theta^{-1}(-\infty, \theta_i]$$

The filtration induces a sequence of homomorphisms in the homology and cohomology groups which can be written as:

$$0 = H_p(\mathcal{K}_0) \to H_p(\mathcal{K}_1) \to \cdots \to H_p(\mathcal{K}_{n-1}) \to H_p(\mathcal{K}_n) = H_p(\mathcal{K}) \qquad (61)$$

$$0 = H^p(\mathcal{K}_0) \leftarrow H^p(\mathcal{K}_1) \leftarrow \cdots \leftarrow H^p(\mathcal{K}_{n-1}) \leftarrow H^p(\mathcal{K}_n) = H^p(\mathcal{K}) \qquad (62)$$

New simplices are introduced one-by-one in the order prescribed by the filtration. Note that the filtration order (as required in persistent homology) is a total order as opposed to the partial orders prescribed by discrete Morse functions. It can be formally proved (but can also be intuitively seen) that when a simplex $\sigma^d$ is introduced at time $T$ in the sequence, it either leads to an increase in the Betti number $\beta_d$ by 1 or a decrease in $\beta_{d-1}$ by 1. Accordingly $\sigma^d$ is called a positive or a negative simplex. Therefore, computing persistent homology of a filtration is akin to pairing each simplex that creates a homology feature (positive simplex) with the one that destroys it (negative simplex). This is analogous to cancellation of critical cells as observed in discrete Morse theory. While typically one outputs a persistence diagram, which is a plot of the points $(\Theta(\sigma), \Theta(\gamma))$ for each persistent pair $(\sigma, \gamma)$, at times the underlying algebraic structures (namely the persistent homology modules) are the more applicable part of the output. In summary, the computational goal of persistent homology is to detect when a new homology class is born and when an existing class dies as one proceeds in the order prescribed by the filtration.
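The pairing of positive and negative simplices can be made concrete with the standard boundary-matrix column reduction over $\mathbb{Z}/2$. The sketch below is that textbook baseline, not the MMUP-APX based algorithm developed in this section; the function name `persistence_pairs` and the set-based column encoding are assumptions made for illustration.

```python
def persistence_pairs(boundaries):
    """Pair positive and negative simplices by column reduction over Z/2.

    boundaries[j] is the set of filtration indices of the codimension-1
    faces of the j-th simplex, with simplices listed in filtration order.
    Returns (pairs, essential): pairs are (birth, death) index pairs and
    essential lists the positive simplices that are never destroyed.
    """
    reduced = []        # reduced columns, stored as sets of row indices
    low_to_col = {}     # lowest nonzero row index -> column that owns it
    pairs = []
    for j, faces in enumerate(boundaries):
        col = set(faces)
        # add earlier columns (mod 2) until the lowest entry is unique
        while col and max(col) in low_to_col:
            col ^= reduced[low_to_col[max(col)]]
        reduced.append(col)
        if col:                       # negative simplex: kills class born at max(col)
            low_to_col[max(col)] = j
            pairs.append((max(col), j))
    paired = {index for pair in pairs for index in pair}
    essential = [j for j in range(len(boundaries)) if j not in paired]
    return pairs, essential
```

For a filled triangle filtered as `a, b, c, ab, bc, ca, abc`, the edge `ca` is positive (it closes a loop) and is paired with the triangle `abc` that fills the loop.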

Although, so far, we have discussed persistence only in relation to simplicial complexes, the same set of ideas holds while computing persistence of cell complexes.

Notation 70 ($H_+(\cdot)$, $H_-(\cdot)$). If $\sigma$ is a positive simplex in the filtration, we denote it by $H_+(\sigma)$. If $\sigma$ is a negative simplex in the filtration, we denote it by $H_-(\sigma)$.

We denote the boundary operator of the original simplicial complex by $\partial$ and the boundary operator of the discrete Morse complex by $\Delta$. We process simplices one-by-one in the order prescribed by the persistence filtration. At times we denote the boundary operator associated to the DGVF $\mathcal{V}_i$ by $\Delta_{\mathcal{V}_i}$ and the DGVF associated to the boundary operator $\Delta_i$ by $\mathcal{V}_{\Delta_i}$.

Figure 16: Vietoris-Rips Complex with Increasing Radii (parameter)


Theorem 13 is one of the deep theorems of discrete Morse theory. It tells us that the chain complex expressed in Theorem 15 encodes the boundary operator of a new cell complex $Q$ which is homotopy equivalent to the original simplicial complex $\mathcal{K}$. The formula for the boundary operator is given by Theorem 16. We exploit this fact to do matrix operations on the boundary matrices of the new cell complex $Q$ instead of doing them on the boundary matrix of the original complex $\mathcal{K}$. Assuming that the number of critical cells is substantially small when compared to the size of the complex, this gives us a new family of persistent homology computation algorithms that operate directly on the cell complex boundary instead of operating on the simplicial boundary matrix. The fact that we use the MMUP-APX algorithm for obtaining a nearly optimal DGVF makes it merely an instance in a family of (discrete Morse theory based) persistent homology algorithms that can directly exploit the cell complex boundary matrix.


Note that Forman gives a criterion for critical pair cancellation as expressed in the Theorem below.

Theorem 71 (Cancellation of critical pairs, Forman [Fo1998, Fo2002]). Let $f$ be a discrete Morse function on a simplicial complex $C$, such that $\sigma^{p+1}$ and $\tau^p$ are critical. Let there be a unique gradient path from $\partial\sigma$ to $\tau$. Then there is another Morse function $g$ on $C$ with the same set of critical simplices except $\tau$ and $\sigma$. Also, the gradient vector field associated to $g$ is equal to the gradient vector field associated to $f$ except along the unique gradient path from $\partial\sigma$ to $\tau$.

If a pair of critical cells $(\tau, \sigma)$ satisfies the critical pair cancellation criterion, we denote it as .
The precise details of critical cell cancellations used in the algorithm sketched in Figure 17 are described in [77].

Theorem 72. The algorithm sketched in Figure 17 correctly computes persistent homology.

Proof sketch. The correctness of the algorithm follows from Theorem 12, Theorem 13, Theorem 15, Theorem 16 and Theorem 71. We sketch the proof below.

Step A: To begin with, we justify the use of the Morse boundary operator over the use of the simplicial boundary operator owing to Theorem 15 and Theorem 16. The procedure for boundary operator construction is described in subsection C.2. This justifies Step A.

Step B: Since the cancellation of critical cells pairs up an existing unpaired criticality ($(d-1)$-dimensional) with a new one ($d$-dimensional), clearly there is a reduction in $\beta_{d-1}$. Hence, in step B, the advent of a $\sigma^d$ s.t. there exists an existing criticality $\tau$ that can be paired with it establishes $\sigma$ as a negative simplex. The recomputation of the boundary operator upon cancellation ensures that it is up to date for the next iteration. The optimality is preserved because OPT(i+1) = OPT(i) − 1. In this step, we observe maximum conceptual similarity and overlap between discrete Morse theory and persistent homology.

Step C: Clearly we may perform column operations on this operator instead of the simplicial boundary operator because both should give the same Betti numbers.

Step D: The boundary operator is recomputed upon reapplication of MMUP. So it is up to date for the next iteration. The reapplication of MMUP ensures the $\log^2(\mathcal{N})$ factor optimality of MMUP.

Step E: Since none of the prior simplices is destroyed, the boundary operator stays up to date with a single computation. The boundary operator correctness and the $\log^2(\mathcal{N})$ factor optimality arguments follow from induction. Also, if the vector field was $\log^2(\mathcal{N})$ factor optimal for $i$, then obviously it will be $\log^2(\mathcal{N})$ factor optimal for $i + 1$ because in this particular case, OPT(i+1) = OPT(i) + 1.

□

Figure 17: Bird's eye view: MMUP-APX based persistent homology algorithm. For each newly arrived simplex $\sigma^d$:

A. Compute $\Delta(\sigma)$ via dynamic programming. Goal: determine whether $H_+(\sigma)$ or $H_-(\sigma)$. Inspect $\Delta(\sigma)$ to find a $\tau^{d-1}$ that satisfies the cancellation criterion with $\sigma$. Takes nearly constant time.

B. If such a $\tau$ exists: reverse the gradient path and recompute the boundary operator $\Delta$. Complexity: $O(\mathcal{N}) + \tilde{O}(\mathcal{A}\mathcal{N})$.

C. If no such $\tau$ exists: use $\Delta_{\mathcal{K}_i + \sigma}$ instead of $\partial_{\mathcal{K}_i + \sigma}$ to perform elementary column operations. Complexity is $O(\mathcal{A}^2)$.

D. Case $H_-(\sigma)$: use MMUP-APX to compute $\mathcal{V}_{i+1}$, then compute $\Delta_{i+1}$ from $\mathcal{V}_{i+1}$ using dynamic programming. Complexity: $\tilde{O}(\mathcal{A}\mathcal{N}) + \tilde{O}(\mathcal{N})$.

E. Case $H_+(\sigma)$: $\Delta_{i+1} \leftarrow \Delta_{\mathcal{K}_i + \sigma}$, and $\mathcal{V}_{\Delta_{i+1}}$ stays $\tilde{O}(\log^2(\mathcal{N}))$-optimal. No computation required.

Theorem 73. The complexity of the MMUP algorithm is $\tilde{O}(\mathcal{N}^2)$ assuming the WMOC condition.

Proof. The MMUP-APX algorithm, which runs in time $\tilde{O}(\mathcal{N})$, and the dynamic programming algorithm with pseudolinear complexity $\tilde{O}(\mathcal{A} \times \mathcal{N})$ to compute the boundary operator described in Theorem 16 are two of the key computational components that drive down the time complexity.

1. The first step is to determine the existence of a cancellable critical pair by inspecting $\Delta(\sigma)$. Since the number of critical cells of $\mathcal{K}_i$ in vector field $\mathcal{V}_i$ is within a polylogarithmic bound of the optimal, the number of non-zero elements in $\Delta(\sigma)$ is $O(\mathcal{A})$. The inspection step involves going through this list to find if the coefficient of one of the entries is either +1 or −1, which takes $O(\mathcal{A})$ time.

2. Depending on the existence of such a pair, we have two cases:

a) When a critical pair (in the sense of discrete Morse theory) is found, the gradient is reversed and the boundary operator $\Delta$ is recomputed using a dynamic update data structure. The time complexity of this operation is $\tilde{O}(\mathcal{A}\mathcal{N})$.

b) Otherwise, elementary column operations are performed on $\Delta$, which has complexity $O(\mathcal{A}^2)$.

3. Note that every time we encounter a negative simplex, we employ the naive approach of recomputing $\mathcal{V}_{i+1}$ from scratch using the MMUP-APX algorithm. Consequently we also need to compute the boundary operator of the entire subcomplex. The time complexity of MMUP-APX is $\tilde{O}(\mathcal{N})$. The time complexity of boundary operator computation is $\tilde{O}(\mathcal{A}\mathcal{N})$. Finally, under the WMOC hypothesis, $\tilde{O}(\mathcal{A}\mathcal{N}) = \tilde{O}(\mathcal{N})$ when MMUP-APX is employed, given the polylogarithmic approximation ratio in bounding the number of criticalities.

Worst Case Complexity. The total complexity is therefore $\tilde{O}(\mathcal{A}\mathcal{N} + \mathcal{A}^2)$. In pathological cases, when the WMOC is not satisfied, we have $O(\mathcal{A}) \sim O(\mathcal{N})$, and the complexity cost we incur per step is no better than the worst case, which is known to be $O(\mathcal{N}^2)$. This makes the overall worst case complexity $O(\mathcal{N}^3)$.

Complexity under the WMOC condition. But, under the reasonable assumption of WMOC, i.e. when $\mathcal{A} \ll \mathcal{N}$, the complexity of the processing time of one simplex in the filtration sequence can be brought down to $\tilde{O}(\mathcal{N})$, thereby bringing down the complexity of persistent homology to $\tilde{O}(\mathcal{N}^2)$ (a sizeable gain over $O(\mathcal{N}^3)$ worst case methods).


□ 

The MMUP-APX algorithm, which runs in time $\tilde{O}(\mathcal{N})$, and the dynamic programming algorithm with pseudolinear complexity $\tilde{O}(\mathcal{A} \times \mathcal{N})$ to compute the boundary operator described in Theorem 16 are two of the key computational components that drive down the time complexity of persistent homology to $\tilde{O}(\mathcal{N}^2)$.

11.3. Witten Morse functions & Scalar Field Design: Rigor, Quality and 
Efficiency. 

We develop the idea of compatibility between input scalar field and the set of all Morse-Witten 
gradient vector fields. MMFEP provides a powerful and sufficiently general modeling tool for 
obtaining approximation guarantees for closely related problems in discrete Morse theory. To 
begin with, we rigorously define what we mean by ‘compatibility’ between an input scalar field 
and the corresponding discrete Morse function. To the best of our knowledge there hasn’t been 
much of an attempt in prior research literature to rigorously formulate the notion of compatibility. 
We then optimize over the set of all possible compatible fields so as to minimize the number of 
‘spurious’ critical cells. Using the approximation algorithm for min-POP as developed in this 
paper, we obtain a nearly optimal (i.e. logarithmic multiple of the optimum) discrete Witten 
Morse function given an input scalar field. Moreover, the runtime of the algorithm is nearly 
linear. In summary, our work hopes to lend rigor and computational speed to the task of finding 
a nearly optimal discrete flat Witten Morse function compatible with an input scalar field. 

We now define the set of all compatible $\varepsilon$-discrete Witten Morse functions (DWMF) $f(\cdot)$, given an input scalar field $\mathcal{F}(\cdot)$. The definition below isn’t constructive because for all cells of dimension $k > 0$, it allows a certain range of values for the DWMF given a scalar field rather than fixing them.

Figure 18: Simplicial Complex $\mathcal{K}$

Definition 74 (Compatible $\varepsilon$-DWMFs $f(\cdot)$ for scalar field $\mathcal{F}(\cdot)$). We say that a discrete Witten Morse function $f(\cdot)$ is compatible with a given scalar field $\mathcal{F}(\cdot)$ if:

1. For every dimension $k \ge 1$, let $\varepsilon^k \in \mathbb{R}$ be the $k$-dim. perturbation error s.t. $\varepsilon^k > \varepsilon^j$ for every $k > j$. Also, $\forall i, j, k$, we must have $\varepsilon^k < |\mathcal{F}(\sigma_i) - \mathcal{F}(\sigma_j)|$ where $i, j$ range across all 0-dimensional simplices and $k$ ranges across all dimensions.

2. To begin with, we equate the DMF $f(\sigma^0)$ for every 0-dimensional simplex (i.e. a vertex) with the corresponding input scalar function $\mathcal{F}(\sigma^0)$.

3. Now, inductively we let $f(\tau^k) = \left\{ \max_{\sigma^0_j \prec\prec \tau^k} \mathcal{F}(\sigma^0_j) \right\} \pm \varepsilon^k$ where the symbol $\prec\prec$ indicates incidence of the 0-dimensional cell $\sigma^0$ on the $k$-dimensional cell $\tau^k$.

Let $\sigma^0_i$ be the 0-dim. simplex of $\tau^k$ for which the maximum $\max_j \mathcal{F}(\sigma^0_j)$ is attained. In this case we say that $\tau^k$ inherits its Morse function value from $\sigma^0_i$.

We denote compatibility by the symbol $\cong$, i.e. if a scalar function $\mathcal{F}(\cdot)$ and a discrete Morse function $f(\cdot)$ are compatible then we write it as $f(\cdot) \cong \mathcal{F}(\cdot)$.

Example 11.1. Consider the simplex with a scalar field $\mathcal{F}(\cdot)$ as shown in Figure 18. The scalar field is given as: $\mathcal{F}(v_0) = 5$, $\mathcal{F}(v_1) = 8$, $\mathcal{F}(v_2) = 12$, $\mathcal{F}(v_3) = 3$. Let $\varepsilon^1, \varepsilon^2, \varepsilon^3$ be arbitrary real numbers satisfying the conditions specified in the definition above. In this case we will have $f(v_0v_1) = 8 \pm \varepsilon^1$, $f(v_0v_2) = 12 \pm \varepsilon^1$, $f(v_0v_3) = 5 \pm \varepsilon^1$, $f(v_1v_2) = 12 \pm \varepsilon^1$, $f(v_1v_3) = 8 \pm \varepsilon^1$, $f(v_2v_3) = 12 \pm \varepsilon^1$. Similarly, we will have $f(v_0v_1v_2) = 12 \pm \varepsilon^2$, $f(v_0v_1v_3) = 8 \pm \varepsilon^2$, $f(v_1v_2v_3) = 12 \pm \varepsilon^2$ and $f(v_0v_2v_3) = 12 \pm \varepsilon^2$. Finally $f(v_0v_1v_2v_3) = 12 \pm \varepsilon^3$.
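The unperturbed part of these values is simply the maximum vertex value of each simplex, which is easy to verify by enumeration. The sketch below does this for the full simplex on the given vertices; the function name `inherited_values` is hypothetical, and the $\pm\varepsilon^k$ perturbations are deliberately left out.

```python
from itertools import combinations

def inherited_values(scalar):
    """Map each simplex of the full simplex on the vertices of `scalar`
    to (maximum vertex value, vertex attaining it)."""
    verts = sorted(scalar)
    values = {}
    for size in range(1, len(verts) + 1):
        for simplex in combinations(verts, size):
            top = max(simplex, key=lambda v: scalar[v])  # vertex it inherits from
            values[simplex] = (scalar[top], top)
    return values

field = {"v0": 5, "v1": 8, "v2": 12, "v3": 3}
values = inherited_values(field)
```

With the scalar field of Example 11.1, `values[("v0", "v3")]` is `(5, "v0")` and `values[("v0", "v1", "v3")]` is `(8, "v1")`, matching the values listed above.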

Possibly the most important issue in designing a discrete Morse function that corresponds to an input scalar field is minimization of spurious critical cells. This is where the MMUP approximation algorithm comes into the picture. So, now our objective is to find $f(\cdot)$ attaining

$$\min_{f(\cdot) \cong \mathcal{F}(\cdot)} \sum_{0 \le i \le D} m_i$$

where $m_i$ are the Morse numbers and $D$ is the dimension of the complex.


It is easy to see that Definition 74 is equivalent to saying that each simplex will inherit a function value $f(\cdot)$ where $f(w^k) = \max_{\alpha^{k-1}_j \prec w^k} f(\alpha^{k-1}_j) \pm |\varepsilon^k - \varepsilon^{k-1}|$.

Consider $w^k$ s.t. $f(w^k) = \left\{ \max_j \mathcal{F}(\sigma^0_j) \right\} \pm \varepsilon^k$. Then let $x^0$ be the vertex incident on $w^k$ for which the maximum is attained. Now, given the structure of a simplicial complex, it is easy to see that $x^0$ is a vertex that is incident on all the $(k-1)$-dimensional faces $\alpha^{k-1}_j$ of $w^k$ with exactly one exception (say $\hat{\alpha}^{k-1}$) each time (irrespective of dimension $k$). For instance, in Example 11.1 if we look at the function value of faces of $v_1v_2v_3$, $v_1v_3$ is the exception. In case of $v_0v_1v_3$, $v_0v_3$ is the exceptional face and so on. This means that except for $\hat{\alpha}^{k-1}$, all other faces of $w^k$ inherit their discrete Morse function value from $x^0$. Consequently, we may potentially form a gradient pair $(\alpha^{k-1}_j, w^k)$ for any of the faces $\alpha^{k-1}_j$ of $w^k$ (except $\hat{\alpha}^{k-1}$) merely by perturbation of at the most the value $\varepsilon^k$ for $w^k$ and the value $\varepsilon^{k-1}$ for $\alpha^{k-1}_j$, thereby ensuring that the gradient pair is compatible with the Morse function value. This allows us to use the following approach.

1. For dimension $d = 1$: We start with one dimensional simplices (say $w^1_j$). Of the two incident 0-dimensional simplices, we allow matchings with one of the 0-dimensional cells while prohibiting matching with the other 0-dimensional simplex. We also record the 0-dim. simplex from which it inherits its function value.

2. For dimensions $d > 1$: For every simplex $w^k$ we allow matchings with all but one $(k-1)$-dimensional cell (say $\hat{\alpha}^{k-1}$) s.t. $w^k$ and $\hat{\alpha}^{k-1}$ inherit their function values from different 0-dimensional simplices (i.e. $\hat{\alpha}^{k-1}$ inherits from a 0-dimensional simplex other than $x^0$). We therefore prohibit the matching $(\hat{\alpha}^{k-1}, w^k)$ while allowing matchings of $w^k$ with all of its other $(k-1)$-dimensional faces.
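Both steps amount to one rule: for each simplex, the prohibited face is the one that does not contain the vertex of maximum scalar value. The sketch below lists these prohibited matchings; the function name `forbidden_pairs` and the tuple encoding of simplices are assumptions made here, and vertex values are assumed to be pairwise distinct so that the maximum vertex is unique.

```python
def forbidden_pairs(simplices, scalar):
    """Return the (face, simplex) matchings to prohibit.

    simplices : iterable of vertex tuples
    scalar    : dict vertex -> scalar field value (assumed pairwise distinct)
    The prohibited face of a simplex is obtained by deleting the vertex x0
    that attains the maximum scalar value on that simplex.
    """
    forbidden = []
    for w in simplices:
        if len(w) < 2:
            continue                       # vertices have no faces to match with
        x0 = max(w, key=lambda v: scalar[v])
        exceptional_face = tuple(v for v in w if v != x0)
        forbidden.append((exceptional_face, w))
    return forbidden
```

For the field of Example 11.1 this prohibits `(v1v3, v1v2v3)` and `(v0v3, v0v1v3)`, in agreement with the exceptional faces named in the text; every other face-simplex incidence remains available for matching.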

For every dimension $k \ge 1$, to encode the prohibition of specific matchings for each simplex $w^k_i$, where $i$ ranges across all simplices of dim. $k$, we use the notion of forbidden edges (in the corresponding oriented Hasse graph). The complement of these edges will be rigid edges. Note that these rigid/forbidden edges are additional to those that are introduced in the MMUP approximation algorithm design. In other words, these rigid edges are the additional rigid edges introduced (apart from the MR and CR rigid edges of the MMUP-APX problem). This makes the scalar field topology problem an instance of the MMFEP problem described above. This problem can (in a manner analogous to MMUP) be reduced to min-POP. This reduction followed by application of the divide-&-conquer SDP-based approximation algorithm gives us an $\tilde{O}(\mathcal{N})$ complexity algorithm. The solution will minimize the number of spurious critical cells (up to a polylogarithmic factor $O(\log^2(\mathcal{N}))$).


Note 11.2. This formulation is equivalent to saying that we optimize over $\varepsilon$-perturbations of all compatible discrete Witten Morse functions to obtain a nearly optimal discrete Witten Morse function in nearly linear time. We may say that the output is not a perturbation of, but precisely, a discrete Witten Morse function because the output of MMUP-APX is actually a gradient vector field and it is always possible to choose values of a discrete Morse function so as to reflect the additional criteria satisfied by a discrete Witten Morse function.

An in-depth treatment that includes stability and robustness properties of $\varepsilon$-discrete Morse functions is deferred to a separate paper.

The use of MMUP in the context of simplicial maps and Conley index computations is deferred.


12. Improvements & further directions. 

12.1. From Morse to Poincare: Zeeman’s conjecture and Forman’s Theory 

The MMUP-APX algorithm can be used to obtain computational certificates in favor of Zeeman’s conjecture, which is an important open problem in pure mathematics. In fact, the Poincare conjecture is implied by Zeeman’s conjecture. A more careful analysis of the implications of discrete Morse theory, especially the use of MMUP-APX based algorithms in relation to the Poincare conjecture, needs to be done.

| How | When | Why | What | Ref. |
|---|---|---|---|---|
| Expansions, Thickening, Deformations. | Pre/Post | Zeeman, Andrews-Curtis, Poincare Conjectures | ZC: $\mathcal{K}^2$ contractible $\Rightarrow \mathcal{K}^2 \times I \searrow \cdot$ | [47], [66] |
| Subdivisions | Pre/Post | 3-dim. Topology | Results by Chillingworth | [17], [18] |
| Iterations. | Post | CHomP-RedHom | More gains in fewer iterations. | [46], [70] |
| Cancel | Post | $O(n)$ assuming WMOC | $O(n)$-post MMUP-APX. Else $\Omega(n^2)$. | [30], [77] |
| Bdry. Pruning | Pre | Full removal. $O(n)$-time. | = Strong Homotopy/LC reductions. | [10], [77] |

Table 6: MMUP: Improvements, Extensions, Further Directions


12.2. Subdivisions and 3-dimensional Topology 

For a contractible yet non-collapsible complex, an increase in collapsibility can potentially be observed under simplicial subdivisions. While heuristic algorithms are not in a position to harness this increase in collapsibility, our algorithm is well-placed to exploit it. An in-depth description of this application is deferred.


13. Concluding Remarks 


As we move into the world of modern massive datasets, we are faced with a familiar challenge in computational topology that growing sizes make far more pressing, namely to minimize the time to compute topological features. One of the most promising approaches is to ‘efficiently’ reduce the problem instance to another problem instance of a significantly smaller size (and yet with the same topology) [46, 68, 70]. Incidentally, discrete Morse theory happens to be the best reduction method in the literature. Now since the problem of computing the optimal Morse function is NP-hard, approximation algorithms are our best bet for designing provably efficient algorithms. In this work, we have harnessed tools from modern algorithmics to design a nearly optimal Morse function. More significantly, we do this in nearly linear time. In the context of large datasets, a fast reduction algorithm can be the very linchpin in driving down the complexity of a wide range of problems in computational topology. In fact, we believe that an efficient implementation of the MMUP-APX algorithm (MWUM-based) can possibly constitute the core understructure of topological libraries.

Appendix A Elementary Algebraic Topology

Definition 75 (Simplicial Complex). A simplicial complex $\mathcal{K}$ is a set of vertices and a
collection $C$ of subsets of vertices called faces. All faces satisfy the following property: any
subset of a face is also a face (i.e. $B \in C, A \subseteq B \Rightarrow A \in C$). Maximal faces
w.r.t. inclusion are known as facets. The dimension of a face $B$ is defined to be $|B| - 1$. The
dimension of the simplicial complex itself is the maximum over the dimensions of its faces.
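
To make Definition 75 concrete, the following is a minimal Python sketch (not part of the original algorithms) that builds the downward closure of a list of facets and recovers dimensions and facets. The helper names `closure`, `dimension`, `complex_dimension` and `facets_of`, as well as the example complex, are assumptions made for illustration; the empty face is omitted for convenience.

```python
from itertools import combinations

def closure(facets):
    """All non-empty faces generated by the given facets (subset-closed)."""
    faces = set()
    for facet in facets:
        verts = sorted(facet)
        for k in range(1, len(verts) + 1):
            faces.update(frozenset(c) for c in combinations(verts, k))
    return faces

def dimension(face):
    """Dimension of a face B is |B| - 1."""
    return len(face) - 1

def complex_dimension(faces):
    """Dimension of the complex: maximum dimension over its faces."""
    return max(dimension(f) for f in faces)

def facets_of(faces):
    """Faces that are maximal with respect to inclusion."""
    return {f for f in faces if not any(f < g for g in faces)}

# Hypothetical example: a filled triangle {0,1,2} with a dangling edge {2,3}.
K = closure([{0, 1, 2}, {2, 3}])
print(len(K), complex_dimension(K), facets_of(K))
```

Storing faces as `frozenset` objects makes the subset test `f < g` directly mirror the inclusion relation used in the definition.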

Definition 76 (Open Cell). An $n$-dimensional open cell is a topological space that is
homeomorphic to an open $n$-dimensional ball.

Definition 77 (Cell Complex). A Hausdorff topological space $X$ is called a finite cell complex if

1. $X$ has a finite number of vertices, which are essentially its 0-dimensional cells $D_i^0$.

2. $X$ is a disjoint union of open cells $\{D_i^n\}$ where $D_i^n$ is an open $n$-cell ($i \in I$
where $I$ is the indexing set, $n \geq 1$).

3. For each open cell $D_i^n$ there is a map $\phi_i : B^n \to X$ such that $\phi_i$ restricted to
the interior of the closed ball $B^n$ defines a homeomorphism to $D_i^n$ and such that
$\phi_i(S^{n-1})$ is contained in the $(n-1)$-skeleton of $X$. (The $k$-skeleton of $X$ is the
union of all open cells $D_j^r$ of dimension $r \leq k$.)

4. Finally, a set $A$ is closed in $X$ if and only if $A \cap \bar{D}_i^n$ is closed in
$\bar{D}_i^n$ for each cell $D_i^n$. Note that $\bar{D}_i^n = \phi_i(B^n)$.

A cell complex is said to be regular if each $\phi_i$ is a homeomorphism and if it sends
$S^{n-1}$ to a union of cells in the $(n-1)$-skeleton of $X$.

In layman's terms, to construct a cell complex you start with points $D_i^0$, then glue lines
$D_j^1$ onto the points, then glue discs $D_k^2$ onto the lines and points, and so on. A cell
complex is therefore a topological space constructed from a union of objects called cells, which
are balls of some dimension, glued together along their boundaries. Cell complexes are the most
convenient objects for doing algebraic topology. But to simplify the discussion, we will instead
provide a basic presentation of simplicial homology.




Figure A19: Need to study Homology: Classification of shapes such as those shown above 












Figure A20: Principal algorithmic problem: Find Homology groups (i.e. n-dim. holes) of 
discretized manifolds above 


























Figure A21: Dim 1 boundary operator 



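Notation 78. Boundary & Coboundary of a simplex $\sigma$: We define the boundary $\partial\sigma$
and respectively the coboundary $\delta\sigma$ of a simplex $\sigma$ as
$\partial\sigma = \{\tau \mid \tau \prec \sigma\}$, $\delta\sigma = \{\rho \mid \sigma \prec \rho\}$.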
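
As an illustration of Notation 78, here is a small Python sketch that computes the boundary and coboundary of a simplex inside a complex stored as a set of `frozenset` faces. It assumes that the relation in the notation means "codimension-one face of"; the function names and the example complex are hypothetical.

```python
from itertools import combinations

def boundary(sigma, K):
    """Codimension-1 faces of sigma that belong to the complex K."""
    candidates = {frozenset(c) for c in combinations(sorted(sigma), len(sigma) - 1)}
    return candidates & K  # the empty set is never in K, so vertices get an empty boundary

def coboundary(sigma, K):
    """Cells of K that have sigma as a codimension-1 face."""
    return {rho for rho in K if len(rho) == len(sigma) + 1 and sigma < rho}

# Hypothetical example: the filled triangle on vertices 0, 1, 2.
K = {frozenset(s) for s in [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]}

tri = frozenset({0, 1, 2})
print(boundary(tri, K))                 # the three edges
print(coboundary(frozenset({0}), K))    # the two edges through vertex 0
```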


Homology groups are the most important and general topological invariants of simplicial and
cubical complexes that are also computationally feasible. At its heart, algebraic topology
is essentially the use of linear algebra to compute combinatorial topological invariants of a
given space. Given a simplicial complex $W$, we can define simplicial $q$-chains, which are formal
sums $\sum_i a_i s_i$ of $q$-simplices $s_i$, where the $a_i$ are elements of an abelian group
(e.g. integers, rationals, finite fields etc.). Furthermore, for each $q$, the sums of
$q$-simplices under addition form an abelian group known as the chain group and denoted by
$C_q(W, \mathbb{Z})$. The $n$-simplex $A = \{v_0, v_1, \dots, v_n\}$ with standard orientation is
denoted as $+[v_0, v_1, \dots, v_n]$. Consider the group of permutations of the vertices of $A$.
The permutations fall into two equivalence classes: even permutations and odd permutations. The
even permutations induce the positive orientation $+[v_0, v_1, \dots, v_n]$ whereas the odd
permutations induce the negative orientation $-[v_0, v_1, \dots, v_n]$.


For each integer $q$, let $C_q(W)$ be the free abelian group (i.e. the chain group) generated by
the set of oriented $q$-simplices of $W$. Let $W_q$ denote the total number of $q$-dimensional
simplices belonging to the simplicial complex $W$. Then, one can show that
$C_q(W) \cong \mathbb{Z}^{W_q}$.

For each $q$, the boundary map $\partial_q$ is defined to be the linear transformation
$\partial_q : C_q \to C_{q-1}$.


Examples of such operations are given in Figure A21, Figure A22 and Figure A23.


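This map gives rise to a chain complex: a sequence of vector spaces and linear transformations:

$$0 \to C_n(W) \xrightarrow{\partial_n} C_{n-1}(W) \to \cdots \xrightarrow{\partial_{q+2}} C_{q+1}(W) \xrightarrow{\partial_{q+1}} C_q(W) \xrightarrow{\partial_q} \cdots \xrightarrow{\partial_2} C_1(W) \xrightarrow{\partial_1} C_0(W) \to 0$$

It can easily be proved that for any integer $q$,

$$\partial_q \circ \partial_{q+1} = 0.$$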
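
As a quick sanity check of this identity, the following worked example applies the boundary map twice to a single oriented 2-simplex. It assumes the standard alternating-sum formula for the simplicial boundary map, which is not spelled out in the text above; a hat marks an omitted vertex.

```latex
\begin{align*}
\partial_q [v_0, \dots, v_q] &= \sum_{i=0}^{q} (-1)^i \, [v_0, \dots, \widehat{v_i}, \dots, v_q], \\
\partial_2 [v_0, v_1, v_2] &= [v_1, v_2] - [v_0, v_2] + [v_0, v_1], \\
\partial_1 \partial_2 [v_0, v_1, v_2]
  &= \big([v_2] - [v_1]\big) - \big([v_2] - [v_0]\big) + \big([v_1] - [v_0]\big) = 0.
\end{align*}
```

Each vertex appears once with a plus sign and once with a minus sign, which is the cancellation pattern behind the general statement.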


In general, a chain complex $C_* = \{C_q, \partial_q\}$ is precisely this: a sequence of abelian
groups $(C_q)$ connected by operators $\partial_q : C_q \to C_{q-1}$ that satisfy
$\partial \circ \partial = 0$.

If one defines

$$Z_q = \ker \partial_q \quad \text{and} \quad B_q = \operatorname{im} \partial_{q+1},$$

then it follows that $B_q \subseteq Z_q$. Elements of $Z_q = \ker \partial_q$ are called cycles, and
elements of $B_q = \operatorname{im} \partial_{q+1}$ are called boundaries. Likewise, $Z_q$ is
called the $q$-th cycle group and $B_q$ is called the $q$-th boundary group. The homology group
$H_q$ then measures the equivalence classes of cycles by quotienting out the boundaries, i.e. this
construction measures how far the sequence is from being exact.

The $q$-dimensional homology of $W$, denoted $H_q(W)$, is the quotient vector space

$$H_q(W) = \frac{Z_q(W)}{B_q(W)},$$

and the $q$-th Betti number of $W$ is its dimension:

$$\beta_q = \dim H_q = \dim Z_q - \dim B_q.$$
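
The Betti number formula above can be evaluated directly with linear algebra. The sketch below is an illustration only, not the reduction-based method of this paper: it assumes rational coefficients, the standard alternating-sum boundary map, and simplices given as sorted vertex tuples. All function names are hypothetical.

```python
from fractions import Fraction

def boundary_matrix(q_simplices, faces):
    """Matrix of the boundary map C_q -> C_{q-1} in the given bases."""
    row = {f: r for r, f in enumerate(faces)}
    mat = [[Fraction(0)] * len(q_simplices) for _ in faces]
    for c, s in enumerate(q_simplices):
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]          # drop the i-th vertex
            mat[row[face]][c] = Fraction((-1) ** i)
    return mat

def rank(mat):
    """Rank over the rationals by Gauss-Jordan elimination."""
    mat = [r[:] for r in mat]
    rk = 0
    ncols = len(mat[0]) if mat else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for r in range(len(mat)):
            if r != rk and mat[r][col] != 0:
                factor = mat[r][col] / mat[rk][col]
                mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rk])]
        rk += 1
    return rk

def betti_numbers(simplices_by_dim):
    """simplices_by_dim[q] lists the q-simplices as sorted vertex tuples."""
    top = len(simplices_by_dim) - 1
    ranks = [0] * (top + 2)   # ranks[q] = rank of the q-th boundary map; the outermost maps are zero
    for q in range(1, top + 1):
        ranks[q] = rank(boundary_matrix(simplices_by_dim[q], simplices_by_dim[q - 1]))
    # dim Z_q = (#q-simplices) - rank of map q;  dim B_q = rank of map q+1
    return [len(simplices_by_dim[q]) - ranks[q] - ranks[q + 1] for q in range(top + 1)]

vertices = [(0,), (1,), (2,)]
edges = [(0, 1), (0, 2), (1, 2)]
print(betti_numbers([vertices, edges]))               # hollow triangle (a circle)
print(betti_numbers([vertices, edges, [(0, 1, 2)]]))  # filled triangle (a disc)
```

The hollow triangle has one connected component and one 1-dimensional hole, while filling in the 2-simplex kills the hole.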


Appendix B Extra Pseudocodes and Proofs 

B.1 Pseudo-FFT gadget

A more fine-grained description of the pseudo-FFT pseudocode is provided in Algorithm 6 and in
Algorithm 7.





Figure B24: FFT mimetic gadget edges. Panels: Hasse graph after edge pair isolation and cloning
of vertices; Hasse graph on addition of pseudo-FFT rigid edges.
























Algorithm 6 pseudo-FFT-subroutines

procedure createFFTnodes(A[1...N], N)
    F[0][1...N] := A^T[1...N];  for all i, L(F[0][i]) := createLabel(F[0][i])
    S[0][1...N] := A^B[1...N];  for all i, L(S[0][i]) := createLabel(S[0][i])
    Rc[0] := N;  j := 0
    repeat
        j := j + 1;  N := ⌈N/2⌉;  Rc[j] := N
        F[j][1...N] := createNodes(N)
        S[j][1...N] := createNodes(N)
    until N ≤ 2
    Rc[j] := 2;  M_L := j
    return F, S, M_L
end procedure

procedure createTOedgeLevel(P, Q, N)
    i := 1;  j := 1
    repeat
        createEdge(P_i, Q_j);  j := j + 1
        if j ≤ N then
            L(P_i) := createLabel(L(Q_{j-1}), L(Q_j))
            createEdge(P_i, Q_j);  j := j + 1
        else
            L(P_i) := createLabel(L(Q_{j-1}))
        end if
        i := i + 1
    until j > N
end procedure

procedure createFROMedgeLevel(P, Q, S*, N)
    i := 1;  j := 1
    repeat
        createEdge(P_i, Q_j);  i := i + 1
        if i ≤ N then
            L(Q_j) := createLabel(L(P_{i-1}), L(P_i))
            createEdge(P_i, S*(L(P_i)))
            createEdge(P_{i-1}, S*(L(P_{i-1})));  i := i + 1
        else
            L(Q_j) := createLabel(L(P_{i-1}))
        end if
        j := j + 1
    until i > N
end procedure






Algorithm 7 pseudo-FFT-gadget

procedure createTOedges(S, Rc, M_L)
    k := 1
    repeat
        createTOedgeLevel(S[k][*], S[k-1][*], Rc[k-1]);  k := k + 1
    until k > M_L
end procedure

procedure createFROMedges(F, S, Rc, M_L)
    k := 1
    repeat
        createFROMedgeLevel(F[k][*], F[k+1][*], S[k], Rc[k]);  k := k + 1
    until k ≥ M_L
end procedure

procedure pseudoFFTGadget
    (F, S, M_L) := createFFTnodes(A, N)
    createTOedges(S, Rc, M_L)
    createFROMedges(F, S, Rc, M_L)
end procedure
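
The level structure built by these subroutines can be illustrated with a short Python sketch. This is only a reading of the pseudocode under stated assumptions: that each level holds half as many nodes as the previous one (rounded up), that the last level is pinned to two nodes, and that createTOedgeLevel attaches each parent to at most two consecutive children. The names `level_sizes` and `to_edges_for_level` are hypothetical, and labels and FROM edges are omitted.

```python
import math

def level_sizes(n):
    """Node counts per level: n, ceil(n/2), ... until at most 2 nodes remain."""
    sizes = [n]
    while True:
        n = math.ceil(n / 2)
        sizes.append(n)
        if n <= 2:
            break
    sizes[-1] = 2  # the pseudocode sets the size of the last level to 2
    return sizes

def to_edges_for_level(num_children):
    """Edges (parent, child), 0-based: parent i is joined to children 2i and 2i+1."""
    return [(child // 2, child) for child in range(num_children)]

print(level_sizes(8))
print(to_edges_for_level(5))
```

Under these assumptions the number of levels grows logarithmically in the input size, and the total number of nodes created is linear in it.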


Appendix C Results referred from concurrent works 

C.1 Smooth and Non-smooth cancellations


Definition 79 (Ridge, Ridge Critical Simplex). We define a critical simplex $\alpha^{p+1}$ to be
ridge critical if there exists at least one simplex $\beta^p$ s.t. $\beta \prec \alpha$ and another
simplex $\gamma^{p-1}$ s.t. $\gamma \prec \beta$, where $(\gamma, \beta)$ is a gradient pair. When
such an $\alpha$ is ridge critical, we say that its ridge occurs at $\beta$. A critical simplex
that isn't ridge critical is referred to as smooth critical.

See Figure C25 for an example and a counterexample of ridge critical simplices.

Theorem 80 (Non-smooth Critical Simplex Cancellation). Let $f$ be a discrete Morse function.
Let $\alpha^{p+1}$ be a ridge critical simplex with a ridge at $\tau^p$ s.t. there exists a simplex
$\gamma^{p-1}$ with $(\gamma, \tau)$ being a gradient vector pair belonging to the gradient field
associated to $f$. Suppose that the gradient field $V_f$ associated to $f$ satisfies the following
conditions: (a.) there do not exist any gradient paths from $\partial\alpha$ to $\tau$ and (b.)
there exists at least one critical simplex $\sigma^p$ with a unique gradient path from
$\partial\sigma$ to $\gamma$. Then, we can obtain a new Morse function $g$ on $\mathcal{K}$ with the
same set of critical simplices except $\alpha$ and $\sigma$ (which are cancelled). Also, the
gradient vector field $V_g$ associated to $g$ is equal to $V_f$ except for (a.) the gradient path
from $\partial\sigma$ to $\gamma$, which will be reversed, (b.) the gradient pair $(\gamma, \tau)$,
which will be deleted, and (c.) the gradient pair $(\tau, \alpha)$, which will be newly introduced.

Proof. To begin with, $\alpha^{p+1}$ and $\sigma^p$ are critical. Since we delete the gradient pair
$(\gamma, \tau)$, $\gamma^{p-1}$ and $\tau^p$ also become temporarily critical. Since, by
hypothesis, there do not exist any gradient paths from $\partial\alpha$ to $\tau$ in $V_f$, adding
the gradient pair $(\tau, \alpha)$ does not create any closed orbits in the gradient vector field
(equivalently, any cycles in the corresponding Hasse graph). Also, since by hypothesis there
exists a unique gradient path from $\partial\sigma$ to $\gamma$, by Theorem 1 we do not create any
closed orbits in the gradient vector field by reversing this gradient path, thereby cancelling
$\sigma$ and the newly created critical simplex $\gamma$. Having done these operations we obtain
$V_g$. Since $V_f$ had no closed orbits to begin with and since we do not create any orbits
through our operations, the newly formed vector field $V_g$ is a gradient field with critical
simplices $\alpha$ and $\sigma$ cancelled. To this field, we associate a new function $g$ by
attributing arbitrary serially increasing values in $\mathbb{R}$ to any total order that
corresponds to the partial order imposed by $V_g$ on all simplices belonging to $\mathcal{K}$. □


We refer to cancellations described by Theorem 80 as non-smooth critical simplex cancellations,
in contrast to cancellations described by Theorem 1 due to Forman [30, 34], which we call smooth
cancellations. Please see C.1 for examples of smooth and non-smooth cancellations.


C.2 Boundary Operator Computation 

Theorem 81 (Boundary Operator Computation, Forman [30]). Consider an oriented simplicial
complex. Then for any critical $(p+1)$-simplex $\beta$ set:

$$\partial\beta = \sum_{\text{critical } \alpha^{(p)}} P_{\alpha\beta} \, \alpha, \qquad P_{\alpha\beta} = \sum_{\gamma \in \Gamma(\beta, \alpha)} N(\gamma),$$

where $\Gamma(\beta, \alpha)$ is the set of discrete gradient paths which go from a face in
$\partial\beta$ to $\alpha$. The multiplicity $N(\gamma)$ of any gradient path $\gamma$ is equal to
$\pm 1$ depending on whether, given $\gamma$, the orientation on $\beta$ induces the chosen
orientation on $\alpha$ or the opposite orientation. With the boundary operator above, the complex
computes the homology of the complex $\mathcal{K}$.
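
The path-counting formula of Theorem 81 can be illustrated with a small Python sketch. To sidestep orientation bookkeeping, the sketch works with coefficients mod 2, so every multiplicity is 1 and only the parity of the number of gradient paths matters; this is a simplification for illustration, not the signed computation of the theorem. Simplices are sorted vertex tuples, and `pair_up` (a hypothetical name) maps each cell to the coface it is matched with in the gradient field.

```python
from functools import lru_cache

def facets(cell):
    """Codimension-1 faces of a simplex given as a sorted vertex tuple."""
    return [cell[:i] + cell[i + 1:] for i in range(len(cell))] if len(cell) > 1 else []

def morse_boundary_mod2(beta, critical, pair_up):
    """Mod-2 coefficient of each critical cell of dimension dim(beta) - 1."""
    @lru_cache(maxsize=None)
    def paths(x, alpha):
        # parity of the number of gradient paths from x to alpha
        if x == alpha:
            return 1
        up = pair_up.get(x)
        if up is None:      # x is critical, or it is matched with one of its own faces
            return 0
        return sum(paths(y, alpha) for y in facets(up) if y != x) % 2

    targets = [a for a in critical if len(a) == len(beta) - 1]
    return {a: sum(paths(x, a) for x in facets(beta)) % 2 for a in targets}

# Hypothetical example: a 4-cycle with one critical vertex and one critical edge.
pair_up = {(1,): (0, 1), (2,): (1, 2), (3,): (2, 3)}
critical = [(0,), (0, 3)]
print(morse_boundary_mod2((0, 3), critical, pair_up))
```

In this example both endpoints of the critical edge flow to the single critical vertex, so the two paths cancel and the Morse boundary vanishes, as expected for a circle. The recursion terminates only because the matching is assumed to be acyclic.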

Theorem 82 (Boundary Operator Computation: Correctness Proof). Algorithm 8 correctly
computes the boundary operator $\Delta$.













Algorithm 8 Boundary operator computation

 1: procedure calcBdryOp(M^2, H, V)
 2:     topologicalSort(H, V, L, 'ASCENDING');
 3:     Δσ_1 = 0;
 4:     for 2 ≤ i ≤ |L|; σ := L[i] do
 5:         if σ ≺ β & (σ, β) ∈ V then
 6:             Δσ = ⟨∂β, σ⟩ × Δβ;
 7:         else
 8:             Let τ_i ≺ σ be the set of regular cells incident on σ s.t. (τ_i, σ) ∉ V;
 9:             Let α_i ≺ σ be the set of critical cells incident on σ;
10:             Δσ = Σ_i Δτ_i × ⟨∂σ, τ_i⟩ + Σ_i α_i × ⟨∂σ, α_i⟩;
11:         end if
12:         if σ.pair = NIL & σ.revPair = NIL then
13:             Δ^c σ := Δσ;
14:         end if
15:     end for
16: end procedure


Proof. Note that we start with a list of cells in an ascending total order. Let us call this
list $\mathcal{L}$. This total order is one of the total orders compatible with the partial order
prescribed by the gradient vector field $\mathcal{V}$. If we assign the function value $i$, i.e.
the index of cell $\mathcal{L}[i]$, to each cell in $\mathcal{L}$, we essentially obtain a Morse
function compatible with the gradient vector field. The first cell we process is the one with the
lowest function value (i.e. the unique minimum). This cell is then followed by cells with
increasingly higher Morse function values. To prove that the computation of the $\Delta$ operator
in subroutine calcBdryOp() is, in fact, the same as that expressed in Theorem 81, we proceed by
induction. Let $\sigma_1$ denote the unique minimum. The base case of induction for
$\Delta\sigma_1$ is trivial. Now suppose that for all cells in the set
$\{\sigma_1, \sigma_2, \dots, \sigma_I\}$ we have correctly computed the boundary operator as
prescribed in Theorem 81, and that we encounter cell $\sigma_{I+1}$.

Suppose first that $\sigma_{I+1}$ has a lower valued coface $\beta$, i.e. $\sigma_{I+1} \prec \beta$
and $(\sigma_{I+1}, \beta) \in \mathcal{V}$. Since $\beta$ has a lower function value than
$\sigma_{I+1}$ (by hypothesis), we conclude that $\beta = \sigma_{J+1}$ for some $J < I$. All paths
emanating from $\sigma_{I+1}$ must go through $\beta$. If the orientation induced by some path
$\gamma_{\beta\rho}$ from $\beta$ to some critical cell, say $\rho$, is $\iota$ where
$\iota = \pm 1$, then the orientation of the path $\gamma_{\sigma_{I+1}\rho}$ will be
$\langle \partial\beta, \sigma_{I+1} \rangle \times \iota$. Therefore, the total count of paths
(with induced orientation accounted for) will be
$\langle \partial\beta, \sigma_{I+1} \rangle \times \Delta\beta$. Hence, the boundary operator
computation done in calcBdryOp() is valid for the case when $\sigma_{I+1}$ has a lower valued
coface.

Finally, assume that $\sigma_{I+1}$ does not have any lower valued coface. Then the flow leaving
$\sigma_{I+1}$ will be through each of its faces (except possibly one higher valued face). If it
indeed has a (matched) higher valued face, then flow will be entering it through that face, and
hence the face in question isn't relevant in calculating the weighted sum of gradient paths that
leave $\sigma_{I+1}$. When we consider the lower valued faces of $\sigma_{I+1}$, we make a
distinction between faces that are non-critical and those that are critical. If a face, say
$\alpha_j$, is critical, then clearly we are justified in directly including the entry
$\alpha_j \times \langle \partial\sigma, \alpha_j \rangle$ as part of the formal sum that makes up
the cell boundary. As for the non-critical entries of the formula, namely
$[\Delta\tau_i \times \langle \partial\sigma, \tau_i \rangle]$, we impose the additional constraint
that $\tau_i$ be paired in $\mathcal{V}_n$ (as opposed to $\mathcal{V}_{n-1}$) in the summation. In
doing so, we rule out all entries that would be valid directed paths going out of $\sigma_{I+1}$
but that won't add up to make gradient paths as prescribed by Theorem 81. Now since $\tau_i$ is
lower valued, its boundary $\Delta\tau_i$ has already been calculated correctly by the induction
hypothesis. But clearly every gradient path emerging from $\sigma_{I+1}$ must first pass through
one of these $\tau_i$'s. Also, for each of these gradient paths, the orientations will change
precisely by the multiple of $\langle \partial\sigma_{I+1}, \tau_i \rangle$. Therefore the weighted
sum of (non-trivial) gradient paths from $\sigma_{I+1}$ will be the sum of all the contributions
by the boundaries of each of the non-critical faces $\tau_i$. To complete the argument for the
induction step, we note that these sums, along with the contributions from the critical faces of
$\sigma_{I+1}$, take into account each gradient path precisely once. Also, it is easy to see that
multiplication by the co-orientation at each step provides the weights that ensure the final
entry carries the induced orientation. Hence proved. □


Theorem 83 (Complexity of Computing Boundary Operator). The complexity of computing
the boundary operator is $O(T \times N)$ where $T$ is the number of critical cells and $N$ is the
size of the complex.


Proof. For the Hasse graph $\mathcal{H}(V, E)$ of a simplicial complex, $|E| \leq (D+1) \times |V|$,
where $D$ is the maximum dimension of cells in the complex (which in our case is 2), since a
cell of dimension $d$ has $d+1$ codimension-one faces. Therefore, $|E| = O(|V|)$. (It is easy to
show that for cubical complexes as well, the number of edges is $O(|V|)$.) The complexity of
computing the topological sort of the oriented Hasse graph is $O(|V| + |E|)$, which is the same
as $O(|V|)$, assuming that our input manifold is either simplicial or cubical. The for loop in
Lines 4-15 of procedure calcBdryOp() costs at most $O(T)$ per iteration while the total number
of iterations is $O(|V|)$. Therefore, the total cost of the for loop is $O(|V| \times T)$. Hence
the complexity of computing the boundary operator is $O(|V| \times T) = O(N \times T)$, since the
number of vertices in the Hasse graph is the same as the number of cells in the complex (i.e.
the size of the complex, namely $N$). □
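
The linear-time topological sort used in this argument can be sketched with Python's standard `graphlib` module (Python 3.9 or later). The sketch assumes simplices stored as sorted vertex tuples and a matching `pair_up` from a cell to its matched coface; both names, and the helper `ascending_order`, are hypothetical. Matched Hasse edges are reversed, and the resulting acyclic graph is sorted so that every cell appears after all the cells it flows into, which is one ascending total order compatible with the gradient field.

```python
from graphlib import TopologicalSorter

def facets(cell):
    """Codimension-1 faces of a simplex given as a sorted vertex tuple."""
    return [cell[:i] + cell[i + 1:] for i in range(len(cell))] if len(cell) > 1 else []

def ascending_order(cells, pair_up):
    """Total order in which every cell appears after all cells it flows into."""
    flows_into = {c: set() for c in cells}
    for c in cells:
        for f in facets(c):
            if pair_up.get(f) == c:
                flows_into[f].add(c)   # reversed Hasse edge: face -> matched coface
            else:
                flows_into[c].add(f)   # ordinary Hasse edge: cell -> face
    # TopologicalSorter treats the mapped sets as prerequisites, so sinks come first.
    return list(TopologicalSorter(flows_into).static_order())

# Hypothetical example: the 4-cycle with critical vertex (0,) and critical edge (0, 3).
cells = [(0,), (1,), (2,), (3,), (0, 1), (1, 2), (2, 3), (0, 3)]
pair_up = {(1,): (0, 1), (2,): (1, 2), (3,): (2, 3)}
order = ascending_order(cells, pair_up)
print(order)
```

If the matching contained a closed orbit, `static_order()` would raise `graphlib.CycleError`, so the same call doubles as an acyclicity check.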

It is worth noting that in the vast majority of practical scenarios $N \gg T$, enough for us
to assume that, compared to the size of the complex, the 'topological complexity' $T$ is nearly
a constant. We therefore use the notation $\tilde{O}(\cdot)$ (where
$O(T \times N) = \tilde{O}(N)$) to indicate the pseudolinear time complexity of boundary
operator computation.


C.3 Boundary Pruning 

C.3.1 Boundary Pruning ~ Strong Collapses ~ LC-reductions

Definition 84 (Boundary face of a Simplicial Complex). Let $\mathcal{K}$ be a simplicial complex.
If there exists a simplex $\sigma \in \mathcal{K}$ s.t. the number of cofaces incident on $\sigma$
equals 1, then we call $\sigma$ a boundary face of the complex $\mathcal{K}$.

Definition 85 (Boundary Pruning). Let $\mathcal{K}$ be a simplicial complex. For every
$1 \leq i \leq n$, let $\eta_i^d$ be a maximal simplex with one or more boundary faces. Let
$\varrho_i^{d-1}$ be one of the boundary faces of $\eta_i^d$. We define boundary pruning to be the
process of successive deletion of pairs $\{\varrho_i, \eta_i\}$ until we reach a point where no
simplices with boundary faces are left, i.e. $\mathcal{K} = \mathcal{K}_0$,
$\mathcal{K}_1 = \mathcal{K}_0 - \{\varrho_1, \eta_1\}$, ...,
$\mathcal{K}_i = \mathcal{K}_{i-1} - \{\varrho_i, \eta_i\}$, ..., $\mathcal{K}_n$ s.t.
$\mathcal{K}_n$ is a complex without boundary faces. The dimension of a pair
$\{\eta^d, \varrho^{d-1}\}$ is said to be $d$. We delete all simplicial pairs (satisfying the
boundary criterion) from dimension $d = D$ to $d = 1$, decreasing the dimension each time the
pairs of that particular dimension are exhausted.
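
A direct, unoptimized reading of Definitions 84 and 85 is sketched below in Python. It is a quadratic-time illustration, not the queue-based linear-time subroutine of Algorithm 9: it repeatedly looks for a face with exactly one coface, checks that this coface is maximal, and deletes the pair, sweeping from the top dimension down to 1. The names `closure`, `cofaces` and `prune_boundary` are hypothetical.

```python
from itertools import combinations

def closure(facets):
    """All non-empty faces generated by the given facets."""
    faces = set()
    for facet in facets:
        for k in range(1, len(facet) + 1):
            faces.update(frozenset(c) for c in combinations(sorted(facet), k))
    return faces

def cofaces(sigma, K):
    """Cells of K that have sigma as a codimension-1 face."""
    return [rho for rho in K if len(rho) == len(sigma) + 1 and sigma < rho]

def prune_boundary(K):
    """Delete (boundary face, maximal simplex) pairs for d = D down to 1."""
    K = set(K)
    pairs = []
    top = max(len(s) for s in K) - 1
    for d in range(top, 0, -1):
        progress = True
        while progress:
            progress = False
            # candidate boundary faces have dimension d - 1, i.e. d vertices
            for rho in sorted((s for s in K if len(s) == d), key=sorted):
                up = cofaces(rho, K)
                if len(up) == 1 and not cofaces(up[0], K):  # sole coface, and it is maximal
                    eta = up[0]
                    K -= {rho, eta}
                    pairs.append((rho, eta))
                    progress = True
                    break
    return K, pairs

core, removed = prune_boundary(closure([(0, 1, 2)]))   # filled triangle
print(core, len(removed))
```

A filled triangle prunes down to a single vertex, whereas a hollow triangle has no boundary faces and is left untouched.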



Theorem 86. For a simplicial complex $\mathcal{K}^D$, if there exists a vertex $v_1$ that dominates
another vertex $v_2$, then for every maximal simplex $\lambda^d$ s.t. $1 \leq d \leq D$ that is
incident on both $v_1$ and $v_2$, we have a boundary face $\gamma^{d-1}$ incident on $v_2$ s.t.
$\gamma \prec \lambda$. For $d = 1$, vertex $v_2$ is the simplex $\gamma$ we seek.


Proof. First consider the case where $2 \leq d \leq D$. Assume that $v_1$ dominates $v_2$. Consider
a maximal simplex $\lambda^d$ incident on both $v_1$ and $v_2$. It is easy to see that if
$d \geq 2$ then $\lambda^d$ has at least three faces. Let $\gamma$ be the convex hull of all
vertices of $\lambda$ except vertex $v_1$. By Theorem 3, $\gamma$ is then a $(d-1)$-dim. face of
$\lambda$, i.e. $\gamma \prec \lambda$. Now, suppose $\gamma$ is not a boundary face. Then it has
more than one coface of dimension $d$. Let $\kappa$ be a coface of $\gamma$ s.t.
$\gamma \prec \kappa$, $\kappa \neq \lambda$. Then $\kappa$ is the convex hull of the vertices of
$\gamma$ along with an additional vertex $v_x$ s.t. $v_x \notin \lambda$. But, since
$v_2 \subseteq \gamma \subseteq \kappa$, we have a maximal simplex $\kappa$ that contains $v_2$ but
not $v_1$. This contradicts our assumption that $v_1$ dominates $v_2$. Therefore, $\gamma$ must be
a boundary face. Now, consider the case when (the dimension of the maximal simplex) $d = 1$ and
$v_1$ dominates $v_2$. Then $v_1$ and $v_2$ share an edge (a 1-dimensional simplex) but no other
higher dimensional simplices. Suppose that there exists an edge incident on $v_2$ that is not
incident on $v_1$. But then this would mean that $v_1$ does not dominate $v_2$, which contradicts
our hypothesis. Since $\{v_1, v_2\}$ is the only edge incident on $v_2$ and since the edge
$\{v_1, v_2\}$ is a maximal simplex, clearly $v_2$ is a boundary face, i.e. $v_2 = \gamma$ when
$d = 1$. □


Once we are done with boundary pruning across all dimensions, there won't be any boundary
faces (of any dimension) left. Furthermore, from Theorem 86 we may conclude that there
does not exist a vertex in the boundary pruned complex $\mathcal{K}_n$ that dominates another
vertex in $\mathcal{K}_n$. In other words, one may use a boundary pruning algorithm to obtain the
core of the complex. Please see C.3.2 for the pseudocode.


The notion of linear coloring (LC) was introduced in [19]. While [19] uses representative
subcomplexes (one vertex per color) to perform reductions, Matousek [64] uses link-cone
reductions and proves that the link-cone reductions are equivalent to the linear-coloring
reductions introduced in [19]. Finally, in [10], on pg. 76, Barmak notes that the notion of
linear-coloring reductions is equivalent to the notion of strong collapses (which are
essentially of the link-cone form) introduced in [11]. The existence (and uniqueness up to
isomorphism) of cores is independently proved in [11] and [64]. Since one may use either strong
collapses or linear coloring reductions or boundary pruning reductions to obtain the core of a
complex, we arrive at the elementary conclusion that
Boundary Pruning ~ Strong Collapses ~ LC-reductions.
The boundary pruning algorithm is described in C.3.2.


C.3.2 Pseudocode for Boundary Pruning

To obtain the core of the input complex $\mathcal{K}$, we do the following:
For decreasing values of $d$ s.t. $D \geq d \geq 1$,
invoke subroutine pruneBoundary(d, K, B^d, V).








Algorithm 9 Boundary Pruning Subroutine

procedure findBoundary(K^d)
    Scan through list K^d (i.e. the d-dim. simplices of K). If a simplex ϖ_i^d = K^d[i] has a
    face ϑ_k^{d-1} such that ϖ_i is the sole coface of ϑ_k, then add ϖ_i to B^d.
    return B^d = {ϖ ∈ K^d | ϖ has at least one boundary face};
end procedure

procedure addPairToVF(τ, ϑ, V, K, B^d, d)
    If τ ≠ NIL and if τ isn't already matched then do the following:
    (a.) Match τ to ϑ and add (ϑ, τ) to vector field V.
    (b.) Delete ϑ and τ from K^{d-1} and K^d respectively. Also, if τ ∈ B^d then delete τ from
         list B^d. Finally, enqueue τ in Q.
end procedure

procedure pruneBoundary(d, K, B^d, V)
    ϖ = dQ(B^d)
    repeat
        Set firstFlag to 'T'.
        repeat
            F(ϖ) := faces(ϖ);  F_B(ϖ) := bdryFaces(ϖ);
            if firstFlag is 'T' & ϑ ∈ F_B(ϖ) & ϑ ≠ NIL then
                addPairToVF(ϖ, ϑ, V, K, B^d, d).
            end if
            Set firstFlag to 'F'.
            for each ϑ_i ∈ F(ϖ) do
                If ϑ_i has exactly two cofaces ϖ and μ_i, then do the following:
                addPairToVF(μ_i, ϑ_i, V, K, B^d, d).
            end for
        until (ϖ = dQ(Q)) = NIL
    until (ϖ = dQ(B^d)) = NIL
end procedure






